from Roscoe's Quick Notes

Oof! At least now it is mainly yard work. For the last 4 ½ hours it's been yard, and street, and sidewalk work as I busy myself cleaning up the mess left by fallen branches from that big tree in my front yard. Now that the street and sidewalk are clear and my big green organics bin is already filled, I'll be cutting the bigger branches into smaller pieces and dragging them around to a back yard (or side yard) staging area, where they'll wait until the city picks up the green bin this Thursday and I can load it up again.

And the adventure continues.

 
Read more...

from Ernest Ortiz Writes Now

In my previous post, My Red Phone, Notepad, and Pencil, I talked about the three main writing tools I always keep in my pocket. But I never identified them. Now that I've started another red Blackwing notepad, and this is my first post about it, I'll give you my thoughts on it.

Note: I’m not affiliated with any products or services I use. No links will be provided.

At first glance, the red glossy cover with the etched image of the Golden Gate Bridge feels smooth but doesn't slip out of my hands. The stitching is durable, and I've never had trouble bending the spine. Nor have I had problems with pages breaking or falling out.

The insides of the front and back covers are blank, which is good because I can write anything on them. I put my contact information and a table of contents there. One of my pet peeves with some notepads (Moleskine Cahiers) is perforations on the last few pages. I hate those! I shouldn't have to tear them off or tape them back together.

Since I write only in wooden pencil, I do see graphite transfer and smearing, just as with any other notepad. But it writes well. The pages are cream-colored, so they're easy on my eyes. I can't say how it handles pen; I'm sure you can find another reviewer who writes in pen.

Finally, it fits well in my back pocket, and it's durable even with me sitting on it every day. It's always ready for me to jot down my blog drafts. Now, as for the price.

It costs $18 before tax for a pack of three. That works out to $6 per notebook, and at 48 pages each you pay about $0.13 a page. It's pricey, but at least they're durable. Would I buy them again? No; there are cheaper options. But if you ever get them, they won't disappoint, whatever your writing needs.

Let me know your thoughts if you've used them.

#writing #746 #Blackwing #notepad

 
Read more... Discuss...

from Lastige Gevallen in de Rede

Post-Post-Package Depression

New times, new misery; it always starts out so beautifully. There were shops, first physical ones, firmly packaged in stone and glass, and then leashed to our homes through IT. We can now order and receive our small and large purchases at home; we have landed in a most joyful era. No longer do we have to get lost in the enormous supermarket, no more trips to shady back alleys in search of a sex video or toy. No, it can come to you from afar in a box impervious to prying eyes, or, if it concerns a television, computer, refrigerator, or stereo set, in a box with the brand and model printed on it a good 1000 times, in giant letters on both the front and the back: your new Viewing Box from Pillips arrives shamelessly out of the Cool Blue, PostNL, or DHL van.

Or else in a space-saving little box from China, the product crammed into a space it could not possibly fit into; but the Chinese are highly capable and extremely inventive, and they achieve the impossible there every day. Fine, that is all lovely. Magnificent. You spend days jumping for joy long before the package arrives, perhaps from Spain; that part is usually a bit unclear. You never know in which country the product from China has landed, in the port or at the airport, trapped in a packed container or mail sack, unloaded, and then given a new, much higher little price, especially if they unpack it in the Netherlands. So it finally arrives, via Rome, say, via Grenoble, via Madrid. You, elated, having sat in suspense for days, unpack it and presto: assemble, adjust, install, display, or simply clean it and put it to use, little dances of joy. But then, the day after, the package is unpacked and the excitement collapses; a gloomy face peeks out from under the cheerful one. The buying energy drains from your body, alas. Among ordinary, unknowing people at home, the phenomenon was known as disappointment, but it is now dawning on people like us that there is more going on.

In a study with twins (yes, twins, the source of all knowledge on Earth), one twin was delivered a package, ordered by him- or herself, while a team of medically trained researchers stood by and watched, took measurements, assessed, verified, and drew blood, sampling this and that. And it turned out there was more going on. Disappointment there was, of course; that comes included with expectations strung so impossibly high and tight. But something deeper, something harmful, was at work in the twin's body and mind. Blood values showed striking fluctuations, the liver function in particular displayed needless defects, and cramps appeared out of nowhere in the fingers and hands a few days after unpacking. The team could just about let all this pass; it was, in itself, within normal bounds. It could be less, but it had to be better: that, after all, is what a health-giving citizen is for, a vigorous person resistant to every ailment, employed to make people better, always and everywhere, promoting healthfulness.

The other twin, without a package, was only tangentially involved in the ordering and receiving, though of course fully informed, because twins naturally share everything. No package can reach the one without the other knowing (multiples, too, are outstanding sources of knowledge). They feel it coming. The team performed the same measurements and tests on this non-recipient, covering both the advance signals and the success during acquisition, after the confrontation with the delivery service. They had predicted beforehand that there might be something measurable in the other twin, but nothing significant: small, trifling fluctuations in mood and values, perhaps some stomach cramps and slightly wider pupils. Receiving a package is personal, is it not?

To their astonishment, it turned out that among the twin brothers (by chance, everyone who had signed up for this study was male), all of the second twins involved also displayed clearly readable symptoms! The team was as alarmed as it was excited by so much potential harm: unhealthy side effects of packages, not only in the package individual but even in his immediate surroundings. The study was submitted to the inspection service for medical truths, and the committee of seven wise gentlemen and two ladies nodded approvingly; there was a future in this. A follow-up study could proceed, and so it did, this time on a grand scale: not only with twins and multiples but also with real people.

A little ball had come into being, and it started rolling, and rolling, from top to bottom, from left to right, back under a tunnel, across a miniature golf course and so on, never dawdling, forever turning and twisting, endlessly. The Netherlands Package Study was launched quickly, faster than PostNL can deliver, certainly faster and better than UPS. Ten thousand people took part; they received money from the research institute, with which they had to order a little package through the universal hospital and receive it at home. The control group received no money and was not allowed to order any gift online during the study, since that could influence the desired research result. Every orderer and every order was followed closely thanks to medical track-and-trace equipment and IT technology, together with a number of flesh-and-blood students and assistant researchers, and, at certain moments, an expert was brought in: a supervisor, the chieftain of the great stem cell.

The package had to be received at home by the test subject, who was fitted the entire time with the latest gadgets in medical package-reception measurement technology, all over his body, in his vicinity, and at locations in the house important to the study. Every blink of the eyes was registered; blood pressure differences were measured before, during, and after the handover, along with heart rate, muscle tension, blood values, and excretions. A lengthy psychological examination had already preceded it all, and during the package phase an evaluation had to be carried out every two hours using the package app, so that it would become clearer what receiving a gift does to the rather unstable mind of the wanting human. In short, it was a wildly expensive study, especially since the package market is still enormous, and if something really is wrong with this process, a great deal must be done to make the receiving human person better than he is. Receiving packages could well be the new asbestos, or the other smoking, worse even than sitting! A community must not hesitate then: big money must be spent to learn the risks, and then even more budget to protect the innocent from other people's urges, or their own.

Various potentially harmful things were soon discovered. First, frustrations: packages rarely arrive as quickly as they should; more and more often they do not arrive at all, or come days late, and the human psyche cannot bear it. The team of psychologists nearly toppled over with amazement: a whole series of consequences was observable in those who received nothing, or received it late; call them the side effects of bad package processes. They saw that at such an overwrought moment people more often took refuge in narcotics, above all drink, cigarettes, sweets, chocolate, and pills, along with an increase in minor domestic violence, often against plants, small pets, and here and there even a child (usually the neighbors'). Crying fits and outbursts of rage arose out of nothing; others fled into a fantasy world or indulged in transgressive sexual behavior. This usually began when a package arrived just one day late; three days was, for most, the line, the mental boundary between normal and pathologically frustrated behavior.

That was bad enough, but worse still was the post-post-package-reception effect: a few hours after receipt, the blood already contained substances you otherwise see only in people suffering from very severe depression. Mood was significantly worse an hour after receipt, with the note that mood the day before had been significantly better, but the bad side was markedly worse than the good side was good. All of this was more than a little worrying; a campaign had to be launched at once, in every possible way, in every possible medium, and so this terrible news reached us here at VVA as well. You should know that I have ordered quite a lot of packages. I thought it was good for me, and certainly for the economy, and thus also for you; I do my best, after all; I devoted myself to the whole world. But now I see that I was mistaken, that I have sinned, that I taught myself unhealthy behavior, and that I must stop ordering immediately. I therefore demand, at once, that all channels, broadcasters, vloggers, and bloggers immediately stop sending messages designed to get me, us, the innocent in all matters, to buy or order things, whether online or in a stone shop with a display window and/or a self-checkout. Because let us be honest: if this kite flies, and fly it will, kilometers up, as high as the sun and beyond. I have the report here in front of me, well, the summary, and I have read the single A4 sheet inside it with the main findings: if this holds for online packages, then presents in an ordinary shop are probably just as bad! So away with these unhealthy habits, this new sitting-smoking, with a needle in the veins and a crate of beer at hand while gaming: unhealthy behavior, yes, but less unhealthy than receiving packages. I demand political intervention!
I have studied the programs of all the local and national parties, and alas and alack, not a single one has this great national health problem on its policy, management, and governance agenda. It must go on there! It is your patriotic duty to prevent us from suffering from your packages and the like. Intervention, hear, hear!

I will soon receive a little package, eight days late! Heaven knows what I have done in the time between the desired arrival and the real one. The drugs were practically coming out of my ears, pricked as they were for the mail and for the da-da-da-da of my doorbell. At seven in the morning I sat, once again, paralytic over my grain breakfast; every day I sat in the doctor's consulting room with a freshly invented ailment, a cry for help. The package stress left me unable even to work from home normally, and all of this must be cured. Tackle it at the source, and calm will finally return on and in Earth. Good, that's done. In a moment I'll check the banking app to see whether I have received the big money from the Universal Health Institute for this urgently necessary announcement, so that the VVA broadcaster can carry on for another few years sending urgently necessary messages about everything that matters for the moment.

 
Read more...

from Nanat83

As he had promised Nana, Archen did not go back to his own house after finishing work. Instead, he headed to his parents' house.

//At the EarthMix House//

When he arrived, the gate of the EarthMix house was opened by Pak Sam (the security guard). Archen rolled down his car window, wanting to greet Pak Sam.

"Evening, Pak," Archen greeted. "Evening, Archen. Well, it's not often you stop by. Missing Pak Mix, huh?" said Pak Sam with a friendly smile.

"Hehehe, yes, and I happen to have some free time today, Pak," Archen replied. "But the boss has a guest right now, Archen," Pak Sam informed him.

'Eh, has "he" arrived first?' Archen thought.

"Oh? Who is it, Pak?" Archen asked, wanting to confirm his guess.

Pak Sam looked awkward, scratching at his neck though it didn't itch, as if he were reluctant to answer Archen's question.

"Well... it's... according to the boss, anyway... um... the boss's future son-in-law... hehe. But I wouldn't really know... maybe the boss was just joking," Pak Sam said hesitantly, as if afraid of passing the wrong information on to Archen.

"Hahaha, Nana must be joking, Pak," said Archen, turning awkward himself. "Hehe, yeah, probably," Pak Sam replied with a stiff smile.

"Alright then, I'll head in, Pak," said Archen, getting only a nod from Pak Sam. . . . . Archen entered the classically styled house. He tried not to shout for his Nana, because there was "someone" there. Archen wanted to keep his image intact, and he certainly didn't want to look like a child. Honestly, even though this matchmaking wasn't certain to end in marriage, Archen still wanted to hold on to his position as the seme.

//HAHAHA... Nata, you have to see this photo!// came a voice from the living room.

Archen headed toward the voice. He saw his Nana with a man, the two of them happily joking around while looking at... something?

"Nana..." Archen called, approaching them.

'Stay cool, Archen, stay cool...' Archen told himself.

"Oh, Chenie, you're here! Come sit next to Nata," said Mix on seeing his son.

Meanwhile, Nata, who had been relaxed at first, turned nervous. He looked at Archen and sat up straight. Archen sat down right beside Nata and stared at him intently. Nata squirmed a little under the stare.

'DAMNNN DSIFEHFUSHH WHAT IS THIS?!! HE LOOKS NOTHING LIKE THE PHOTO,' Archen thought.

'Is this really a guy? Why don't I believe it? Something that looks like this has a dick? IS NANA LYING TO ME??' Archen thought.

'What if I just kiss him right away? HEHEHE,' Archen thought.

Dunk Natachai Boonprasert

.

.

"Chenie... Archen... Joong Archen!!!" Mix called, exasperated. "Huh... eh... what is it, Na? You startled me," said Archen, patting his chest. "What's with you! Staring at Nata like that!" said Mix. Archen only smiled awkwardly.

"Alright, you two get acquainted," said Mix, patting Nata's shoulder; Nata just smiled and nodded. "Archen, keep Nata company. Nana is going to call your Baba first," said Mix, and left Archen and Nata alone.

"Um, hi, I'm Joong Archen. Call me whatever you like," said Archen, holding out his hand. Nata shook it. "Eh, I'm Dunk Natachai. Just call me Dunk," said Nata. "Why not Nata?" Archen shot back quickly.

"Actually, that's a nickname for the people closest to me, but if you want to call me Nata, that's fine too," Nata answered. "Oh. So what's your kid's name, Nata?" Archen asked, watching Nata intently.

Nata was a little surprised. He had imagined many possible reactions from Archen on learning that Nata already had a child, but all the bad possibilities evaporated when Archen asked about the child with a perfectly ordinary expression. Even if it didn't mean Archen accepted his child, at least Nata wouldn't have to struggle to bring up the subject of Jaidee.

"His name is Jaidee Boondin Boonprasert; he's only four years old," Nata answered. "He takes his mother's surname? Why not his father's instead, say, Jaidee Boondin Aydin?" said Archen, trying to tease the single father of one in front of him.

"Huh? Boonprasert is my surname, by the way. I'm Jaidee's FATHER," Nata answered, a bit curtly. "Eh... oh, right... I mean... um," Archen stammered; he hadn't expected Nata to get angry at his teasing. Archen took a breath. "Sorry..." he said softly.

'No wonder he's still not married; turns out the son is this much of a freak,' Nata thought.

. . . Archen and Nata didn't talk much after that; they looked stiff and awkward.

 
Read more... Discuss...

from 下川友

Day four of high fever. Considering the accumulated fatigue and what's left of my willpower, today is the hardest so far.

Since it was judged not to be a bacterial fever, I assumed it was probably just a fever from exhaustion. Normally I'd expect a good day's sleep to cure it, yet I've been hovering around 38.5°C for four days now.

My head hurt too much to sleep, so I had my medication switched from Calonal to Loxonin. It was my first time taking Loxonin, and the effect is astonishing. That headache vanished in an instant.

This is probably common knowledge among regular Loxonin users, but having that intense pain recede all at once is terrifying. Imagining that the pain actually still exists, and that things are quietly getting worse behind the scenes, makes it even scarier.

I want to get back to a life that doesn't need Loxonin, and soon. It would be a problem if, feeling no pain, my body suddenly started crumbling apart.

Twice in these four days, my body stopped obeying me; I went numb all over and couldn't move. Thinking this was bad, I called an ambulance both times.

But both times, by the time the paramedics arrived, some freedom of movement had returned and I could stand. Perhaps because of that ("you can move, can't you?"), I wasn't treated as very serious, and in the end I went to the hospital on my own two feet.

"You can ride in the ambulance, but we can't guarantee you'll be seen. It would be no different from booking an appointment yourself and walking in on your own. What would you like to do?" they asked. Why ask a sick person to make that call?

Sure enough, after I got to the hospital, they asked, "What tests would you like done today?" I've already described my symptoms; why won't they make that judgment themselves?

And yet the voice stays gentle the whole time, with the initiative firmly on their side: a kind, frightening voice, that high customer-service pitch, and it leaves me feeling somehow ill.

Why do I have to be the one to decide, I think. It's like being made to choose the size of your plastic bag at the register. I still can't understand why that decision step exists at all.

Tomorrow I'll be home alone, so I asked my wife to pick up some instant rice porridge and udon on her way back. I pray that when I wake up tomorrow, my body hasn't turned to powder.

 
Read more…

from Un blog fusible

JOURNAL, March 16, 2026

We went to explore the mountain a little. We climbed up to an open spot; in the slightly hazy distance we could see the Musashi Mitake shrine, and further off, down below, Tôkyô, immense in its blue-grey haze, all in a silence that would be deafening were it not for the birds and a little wind in the trees, the dry branches creaking faintly as they rub together. The ground is soft underfoot: moss and dead leaves, or the needles of the tall pines. We breathe deep, our heads level with the clouds. It's funny how we love to climb. We like being up high more than down on the plain. Our legs demand slopes to scale.

It's cold. The night was freezing in the ryokan. We had a heater in the room, and with a double duvet it was fine. We had a good breakfast served by the young girl, a little shy, a little wild. It was charming. At this time of year they never have visitors. She doesn't get bored; she raises funny rabbits with floppy ears. Her grandfather is a calligrapher. Her parents each went their own way, leaving her to her grandparents, who raised her. Classic. Seen it so many times. She has no desire for another life. I can't say she's wrong. It will be dark in a little over an hour, I think. We're taking a break, there's signal here, then we'll head back to the inn at an easy pace. We don't know the area, so we'd rather be back before nightfall.

* * * *

We drank beer with the very kind people of the hamlet and ate more smoked pork with tsukemono; we had a lovely, warm time. Afterwards, a bath in the traditional wooden tub, very comfortable. And now, bed: two duvets and the electric heater. It's freezing; the night is very cold. We'll be in Tôkyô tomorrow by the end of the day; we may go to a hotel. We don't know yet. We're improvising, and it's very pleasant.

 
Read more...

from G A N Z E E R . T O D A Y

“In 1378, two years before Poggio's birth, the seething resentment of these miserable day laborers, the popolo minuto, had boiled over into a full-scale bloody revolt. Gangs of artisans ran through the streets, crying, 'Long live the people and the crafts!' and the uprising briefly toppled the ruling families and installed a democratic government. But the old order was quickly restored, and with it a regime determined to maintain the power of the guilds and the leading families.”

First time for me to hear of this revolt. Also from Stephen Greenblatt's THE SWERVE, which covers so much ground.

“Poggio's way of fashioning letters was a move away from the intricately interwoven and angular writing known as Gothic hand. The demand for more open, legible handwriting had already been voiced earlier in the century by Petrarch (1304-1374). Petrarch complained that the writing then in use in most manuscripts often made it extremely difficult to decipher the text, 'as though it had been designed,' he noted, 'for something other than reading.'”

Extremely my shit on so many levels this book.

#reads

 
Read more... Discuss...

from Crónicas del oso pardo

These days the pioneers of the Internet are celebrated. The first web page, the first video, the first chat... I agree, they deserve it. There is no doubt that all of this has transformed our society. But... what about me?

In the early days of the Internet, I was twenty-four years old. I was a cheerful young man from Minnesota. Back then, a promising New York company offered me, as it did others, a free account as a tester of what would become a primitive cloud for storing digital files.

They didn't give you much space, but for me it was enough. I stored documents and images. It was marvelous. You didn't upload folders, only individual files. You had to create each folder and then upload the files one by one. It was no small amount of work.

I sincerely believed this program was going to transform humanity. And so I worked every day, for several months, to make the program efficient, until one day, when I entered my password, it wouldn't open. It was as if dead. I thought it might be a failure of my computer, or of the Internet server. But no. And turning it over in my head, I told myself: "If the problem persists, tomorrow I'll write to support."

The next morning, the primitive cloud's page had disappeared, and the emails I sent to support bounced back one by one. Until my breath began to fail me and I fell into a tailspin.

Yes, I was the Internet's first depressive, and I claim my place in history.

 
Read more...

from G A N Z E E R . T O D A Y

“Ancient Greeks and Romans did not share our idealization of isolated geniuses, working alone to think through the knottiest problems.”

From Stephen Greenblatt's riveting book, THE SWERVE: HOW THE WORLD BECAME MODERN.

“Such scenes—Descartes in his secret retreat, calling everything into question, or the excommunicated Spinoza quietly reasoning to himself while grinding lenses—would eventually become our dominant emblem of the life of the mind. But this vision of proper intellectual pursuits rested on a profound shift in cultural prestige, one that began with the early Christian hermits who deliberately withdrew from whatever it was that pagans valued: St. Anthony (250-356) in the desert or St. Symeon Stylites (390-459) perched on his column. Such figures, modern scholars have shown, characteristically had in fact bands of followers, and though they lived apart, they often played a significant role in the life of large communities. But the dominant cultural image that they fashioned—or that came to be fashioned around them—was of radical isolation.”

I maintain that the very notion of the “isolated genius” today, still prominent in contemporary culture, is an absolute fiction. No man is an island. Certain aspects of one's environment or social makeup may drive a person to feel like one, but true, absolute islandhood is bound to spell the death of one's creative life.

“Not so the Greeks and Romans. As thinking and writing generally require quiet and a minimum of distraction, their poets and philosophers must have periodically pulled away from the noise and business of the world in order to accomplish what they did.”

The keyword here being “periodically.”

“But the image that they projected was social. Poets depicted themselves as shepherds singing to other shepherds; philosophers depicted themselves engaged in long conversations, often stretching out over several days. The pulling away from the distractions of the everyday world was figured not as a retreat to the solitary cell but as a quiet exchange of words among friends in a garden.”

House-sitting for friends now, and spotted the book on one of the shelves. Decided to flip through it a few days ago and haven't been able to put it down since.

“The invention of movable type in the fifteenth century changed the scale of production exponentially, but the book in the ancient world was not a rare commodity: a well-trained slave reading a manuscript aloud to a roomful of well-trained scribes could produce masses of text. Over the course of centuries, tens of thousands of books, hundreds of thousands of copies, were made and sold.

“There was a time in the ancient world—a very long time—in which the central cultural problem must have seemed an inexhaustible outpouring of books. Where to put them all? How to organize them on the groaning shelves? How to hold the profusion of knowledge in one's head? The loss of this plentitude would have been virtually inconceivable to anyone living in its midst.”

Call me a pessimist but I kind of foresee an inevitable halt to the plentitude of data (along with “content”) being collectively churned out by present human civilization as well.

“Then, not all at once but with the cumulative force of mass extinction, the whole enterprise came to an end. What looked stable turned out to be fragile, and what had seemed for all time was only for the time being.”

Those who cannot remember the past are condemned to repeat it, after all.

“Starting as early as 300 BCE, the Ptolemaic kings who ruled Alexandria had the inspired idea of luring leading scholars, scientists, and poets to their city by offering them life appointments at the Museum, with handsome salaries, tax exemptions, free food and lodging, and the almost limitless resources of the library.”

Take me back to ancient Alexandria please.

“The recipients of this largesse established remarkably high intellectual standards. Euclid developed his geometry in Alexandria; Archimedes discovered pi and laid the foundation for calculus; Eratosthenes posited that the Earth was round and calculated its circumference to within 1 percent; Galen revolutionized medicine. Alexandrian astronomers postulated a heliocentric universe; geometers deduced that the length of a year was 365 ¼ days and proposed adding a “leap day” every fourth year; geographers speculated that it would be possible to reach India by sailing from Spain; engineers developed hydraulics and pneumatics; anatomists first understood clearly that the brain and the nervous system were a unit, studied the function of the heart and the digestive system, and conducted experiments in nutrition. The level of achievement was staggering.”

Let us remind ourselves that what inevitably consigned all that achievement to oblivion and brought it all down was the prejudice exercised by the Roman Empire—some 270 years after Ptolemaic rule—against one of its minority groups/cults concentrated in the eastern provinces, and—as the story goes—charging one member of that minority with treason against the empire before executing him.

#reads

 
Read more... Discuss...

from An Open Letter

I went to the concert by myself, and I had the extra ticket for E. I tried inviting a ton of friends, even made a post on my instagram story, but no one was free, so I ended up just going by myself. I wore my stupid little night time costume, and I decided that nothing stopped me from larping as someone who is super extroverted and sociable. I went to the venue and talked to a ton of people, and even had a girl come up to me and show me a picture of her in the same costume; we talked off and on throughout the concert and exchanged instagrams.

At one point the artist asked if there were any lovebirds in the crowd and got a few cheers, then said something about there being a lot of singles, and I yelled “I just broke up with my ex,” and we talked back and forth mid-concert. I said I had an extra ticket because we were supposed to go together, and she talked about how she loved my energy and my costume, and asked how I was holding up. The crowd cheered for me, one guy yelled “fuck her!” and gave me a fist bump, and the guy next to me gave me a happy hug.

After the show I talked with the band and got a picture with them, which was great! I also got stopped by the drummer of one of the openers, and we talked for a while because he thought my costume was hilarious. I talked with so many different people, and even had a guy next to me ask for my instagram because we were dancing together. Several people approached me and complimented my outfit, and I'm just overall very proud of myself for going.

 
Read more...

from Iain Harper's Blog

In September 2025, OpenAI published a paper that said something the AI industry already suspected but hadn’t quite articulated. The paper, “Why Language Models Hallucinate”, authored by Adam Tauman Kalai, Ofir Nachum, Santosh Vempala, and Edwin Zhang, didn’t just catalogue the problem. It pointed the finger at the evaluation systems that are supposed to keep models honest and argued that those systems are actively making hallucination worse.

The paper’s central argument is disarmingly simple. Language models hallucinate because we reward them for guessing. The training loops, the benchmarks, the leaderboards that determine which model gets called “best” all operate on a scoring system that treats confident wrong answers and honest uncertainty as equally worthless. Under those rules, the rational strategy for any model is to always take a shot, even when the evidence is thin. And that strategy produces hallucinations.
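The incentive the paper describes can be sketched as a toy expected-value calculation. Under a binary scoring rule (1 point for a correct answer, 0 for both wrong answers and abstentions), guessing weakly dominates saying "I don't know" no matter how thin the evidence. The scoring values and the 0.1 probability below are illustrative assumptions, not numbers from the paper:

```python
def expected_score(p_correct: float, abstain: bool,
                   right: float = 1.0, wrong: float = 0.0, idk: float = 0.0) -> float:
    """Expected score for one question under a simple scoring rule.

    p_correct: the model's probability of answering correctly if it guesses.
    abstain:   whether the model answers "I don't know" instead of guessing.
    """
    if abstain:
        return idk
    return p_correct * right + (1 - p_correct) * wrong

# Under binary grading (wrong == idk == 0), even a long-shot guess beats abstaining:
p = 0.1
assert expected_score(p, abstain=False) > expected_score(p, abstain=True)

# A rule that penalises confident errors (wrong = -1) flips the incentive
# whenever the model's chance of being right is below 50%:
assert expected_score(p, abstain=False, wrong=-1.0) < expected_score(p, abstain=True)
```

The second assertion hints at the usual proposed fix: score "I don't know" above a confident wrong answer, and the rational strategy under uncertainty changes from guessing to abstaining.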

Researchers have known for years that models tend toward overconfidence. But the OpenAI paper formalised it with mathematical precision and made an argument that goes further than most. The problem is that our entire evaluation infrastructure systematically incentivises the specific failure mode we claim to care most about fixing.

An illustration representing hallucination

The Mechanics of Making Things Up

To understand why the paper matters, it helps to start with what hallucination actually is at a mechanical level.

During pretraining, a language model learns to predict the next token in a sequence. It ingests billions of documents and builds a statistical model of what words tend to follow other words in what contexts. This process is extraordinarily powerful for capturing patterns, grammar, reasoning structures, and factual associations. But it has an inherent limitation that no amount of scale can fully overcome.

Some facts appear in training data frequently enough that the model can learn them reliably. The capital of France, the boiling point of water, the year the Berlin Wall fell. These are high-frequency, well-attested facts that leave strong statistical signals. But other facts appear rarely or only once. The title of a specific researcher’s PhD dissertation. The birthday of a mid-career academic. The precise holdings of a niche legal case from 2019. These “singleton” facts leave weak or ambiguous traces in the training distribution, and no model, regardless of size, can learn them with confidence from pattern matching alone.

The OpenAI paper draws an analogy to supervised learning that makes this intuitive. In any classification task, there’s an irreducible error rate determined by the overlap between classes in the training data. Generative models face an equivalent problem, because some questions simply cannot be answered correctly from the training distribution, and the model’s best option in those cases would be to say “I don’t know.” The paper refers to this as the model’s “singleton rate,” the fraction of facts that appeared only once during training and therefore can’t be reliably recalled.

This matters because it puts a hard floor under hallucination rates regardless of model size or architecture. You can make a model bigger, train it on more data, and give it better reasoning capabilities, and you will reduce hallucinations on well-attested facts. But you will never eliminate them on rare facts, because the statistical signal for those facts is too weak to distinguish from noise. The paper is explicit about this point. Even a 100% accurate model on common facts would still hallucinate on singleton facts, and the only alternative to hallucination on those facts is abstention.

None of this is mysterious. It’s basic statistics applied to language modelling. But what happens next, in the post-training phase, is where things go wrong in a more avoidable way.

The Test-Taking Incentive Problem

After pretraining, models go through rounds of fine-tuning designed to make them more helpful, less harmful, and better at following instructions. This process involves evaluation on benchmarks, and it’s here that the OpenAI paper identifies the core dysfunction.

The paper’s authors compare modern AI benchmarks to multiple-choice tests where leaving an answer blank guarantees zero points. On such tests, the optimal strategy for a test-taker who doesn’t know the answer is to guess. There’s some chance of being right, and no additional penalty for being wrong. Language model benchmarks work on the same principle, and most prominent evaluations, including MMLU-Pro, GPQA, MATH, and others that dominate public leaderboards, use binary scoring where a correct answer scores one point and everything else, whether wrong or abstained, scores zero.

Under this system, a model that says “I don’t know” to a question it’s uncertain about gets exactly the same score as a model that confidently invents an answer. But the model that guesses will occasionally be right by chance, which pushes its aggregate accuracy higher. Since accuracy is the number that appears on leaderboards, in model cards, and in press releases, the models that guess most aggressively tend to look best.
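The arithmetic behind this incentive is worth making explicit. Under binary scoring, answering always has non-negative expected value, so abstaining can never beat guessing. A minimal sketch, using hypothetical probabilities rather than figures from the paper:

```python
# Expected benchmark score for one question under binary scoring:
# a correct answer earns 1 point; wrong answers and abstentions earn 0.
def expected_score_binary(p_correct: float, guess: bool) -> float:
    """Expected points, given the model's chance of being right if it
    answers at all. Abstaining always scores exactly 0."""
    return p_correct if guess else 0.0

# Even a 5% chance of being right makes guessing the strictly better
# strategy, because there is no downside to being wrong.
for p in (0.05, 0.25, 0.5):
    assert expected_score_binary(p, guess=True) > expected_score_binary(p, guess=False)
```

This is why the paper calls always answering the "rational strategy": the expected score of a guess is never below the guaranteed zero of an abstention.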

The paper illustrates this with a concrete example from SimpleQA-style metrics. One model showed an error rate of 75% with only 1% abstentions, meaning it almost never admitted uncertainty and was wrong three-quarters of the time when it did answer. Another model abstained 52% of the time and dramatically reduced its error rate. But on a traditional accuracy-only leaderboard, the difference between these two models would look modest, because the metric that gets reported doesn’t distinguish between “wrong” and “chose not to answer.”

This is not an edge case in how benchmarks work. It’s the dominant paradigm. As the paper puts it, the majority of mainstream evaluations reward hallucinatory behaviour. The proposed fix is almost embarrassingly obvious, and borrowed directly from standardised testing. Introduce negative marking for wrong answers, or give partial credit for appropriate expressions of uncertainty, so that honest non-answers score better than confident mistakes.
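The standardised-testing fix can be sketched the same way. With a penalty of c points per wrong answer, answering only has positive expected value when the model's chance of being right exceeds c / (1 + c); below that threshold, the honest zero of an abstention wins. The penalty value below is illustrative, not one the paper prescribes:

```python
def expected_score_penalised(p_correct: float, penalty: float) -> float:
    """Expected points for answering under negative marking:
    +1 if right, -penalty if wrong. Abstaining still scores 0."""
    return p_correct * 1.0 - (1.0 - p_correct) * penalty

# With a 1-point penalty per wrong answer, guessing only pays off when
# the model is more than 50% sure (threshold = penalty / (1 + penalty)).
threshold = 1.0 / (1.0 + 1.0)
assert expected_score_penalised(0.4, penalty=1.0) < 0.0  # abstain instead
assert expected_score_penalised(0.6, penalty=1.0) > 0.0  # answering pays
```

Tuning the penalty moves the confidence threshold: a harsher penalty demands higher confidence before answering becomes rational, which is exactly the lever negative marking gives benchmark designers.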

Looking Inside the Black Box

While OpenAI approached the problem from the evaluation and incentive angle, Anthropic’s interpretability team was working on the same question from the opposite direction, looking at what actually happens inside a model when it decides whether to hallucinate or abstain.

In March 2025, Anthropic published two papers under the banner “Tracing the Thoughts of a Large Language Model” that used a novel “AI microscope” technique to map the computational circuits inside Claude 3.5 Haiku. Among the results was a discovery that runs counter to most people’s intuitions about how hallucination works.

It turns out that Claude’s default behaviour is to refuse to answer. The researchers identified a circuit that is active by default and causes the model to state that it has insufficient information to respond to any given question. This “I don’t know” circuit fires every time Claude receives a query, regardless of the topic. For the model to actually produce an answer, a competing mechanism has to override it. When Claude is asked about something it knows well, a “known entity” feature activates and inhibits the default refusal circuit, allowing the model to respond.

Hallucinations happen when this override misfires. The researchers showed that when Claude recognises a name but doesn’t actually know much about the person, the “known entity” feature can still activate, suppressing the refusal circuit and pushing the model into fabrication mode. By artificially manipulating these circuits in experiments, they could reliably induce hallucinations about fictional people, and by strengthening the refusal circuit, they could prevent them.

This result reframes hallucination as a circuit imbalance rather than a deep-seated flaw. The model already has the machinery to recognise uncertainty and decline to answer. The problem is that this machinery sometimes loses the tug-of-war with the model’s competing drive to produce fluent, helpful-sounding output. And that drive is reinforced by training regimes and evaluations that treat helpfulness as the primary virtue and treat caution as a failure.

The interpretability work and the OpenAI incentives paper are telling the same story from different vantage points. One looks at the external pressures that shape model behaviour and the other looks at the internal mechanisms those pressures create. Both arrive at the same conclusion. Models don’t hallucinate because they’re broken. They hallucinate because the systems we’ve built around them reward confident output and punish honest uncertainty.

Not All Hallucinations Come From the Model

The OpenAI and Anthropic work both locate hallucination inside the model, whether in its training incentives or its internal circuits. But a September 2025 paper in Frontiers in Artificial Intelligence by Anh-Hoang, Tran, and Nguyen adds a third variable that most evaluation frameworks ignore entirely, and that variable is the prompt itself.

The paper introduces formal metrics for separating prompt-induced hallucinations from model-intrinsic ones — three new acronyms to quantify what practitioners already know, which is that bad prompts make bad outputs worse. Conditional Prompt Sensitivity (CPS) measures how much hallucination rates change when you vary the prompt while holding the model constant. Conditional Model Variability (CMV) measures the reverse, how much rates change across models given the same prompt. A third metric, Joint Attribution Score (JAS), captures the interaction effect between the two.

The results are unambiguous. Vague, underspecified prompts dramatically increase hallucination rates in some models but not others. LLaMA 2 showed CPS values of 0.15 under ambiguous prompting, meaning prompt design accounted for a large share of its fabrication behaviour. GPT-4, by contrast, was far less prompt-sensitive (CMV of 0.08), suggesting its hallucinations were more model-intrinsic and less dependent on how the question was framed. Structured prompting techniques like Chain-of-Thought reduced CPS to 0.06 across the board, a meaningful drop that required no model changes at all.
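The hold-one-factor-fixed idea behind these metrics is easy to illustrate, even without the paper's exact formulas (which are not reproduced here). The sketch below uses the population standard deviation as a stand-in for "sensitivity", and the hallucination rates are made up for illustration:

```python
from statistics import pstdev

# A rough, CPS/CMV-flavoured sensitivity sketch: vary one factor (prompt
# or model), hold the other fixed, and measure the spread of rates.
# This is NOT the Frontiers paper's formula, just the underlying idea.
def sensitivity(hallucination_rates: list[float]) -> float:
    """Spread of hallucination rates as one factor varies."""
    return pstdev(hallucination_rates)

# Hypothetical rates: one model under four prompt phrasings (CPS-like),
# then four models given one fixed prompt (CMV-like).
cps_like = sensitivity([0.35, 0.20, 0.10, 0.08])  # prompt varies, model fixed
cmv_like = sensitivity([0.18, 0.16, 0.17, 0.15])  # model varies, prompt fixed

# High prompt-driven spread with low model-driven spread suggests the
# prompt, not the model, is driving the fabrication.
assert cps_like > cmv_like
```

The point of separating the two measurements is attribution: the same aggregate hallucination rate can come from a fragile model, a fragile prompt, or both, and only a two-factor design can tell them apart.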

The practical implication is that hallucination isn’t always a model problem. Sometimes it’s a prompting problem, and sometimes it’s both at once. Models with high JAS scores, like LLaMA 2 under ambiguous prompts (JAS of 0.12), show compounding effects where weak prompts and model limitations multiply each other’s worst tendencies. This means the standard evaluation practice of testing models with fixed prompt templates and attributing all variation to model quality is systematically misleading. Two teams using the same model with different prompt architectures could see wildly different hallucination rates, and neither team’s experience would be wrong.

This reframes the question of responsibility. If a model hallucinates because the prompt was ambiguous, is that a model failure or a deployment failure? Current benchmarks don’t ask this question. They test models under controlled prompting conditions and report a single hallucination rate, flattening a two-dimensional problem into one number. The Frontiers paper suggests that useful evaluation would need to test across a range of prompt qualities, measuring how often a model hallucinates and how sensitive it is to the way questions are asked.

How Evaluation Is Changing (Slowly)

Newer benchmarks are starting to incorporate abstention as a legitimate outcome, but they remain a minority voice in a field still dominated by accuracy-only scoring.

SimpleQA, released by OpenAI in late 2024, treats abstention as a first-class outcome. Each response is graded as correct, incorrect, or not attempted, which makes it possible to measure whether a model knows what it doesn’t know. This is a meaningful step, and the benchmark has been widely cited. But it covers only 4,326 short factual questions with single correct answers, which makes it narrow by design and increasingly saturated. GPT-4o with web search now reaches around 90% accuracy on SimpleQA, and GPT-5 with search and reasoning pushes above 95%, which means the benchmark is approaching its ceiling for models with access to external tools.
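Three-way grading makes two separate numbers visible where binary scoring shows one. A small sketch of how such grades can be tallied (the labels are illustrative, not the benchmark's exact field names):

```python
from collections import Counter

# SimpleQA-style grades: each response is "correct", "incorrect", or
# "not_attempted", so accuracy and abstention are measured separately.
grades = ["correct", "incorrect", "not_attempted", "correct", "not_attempted"]
counts = Counter(grades)

# Accuracy conditioned on actually answering, plus a separate
# abstention rate: two numbers a raw-accuracy leaderboard collapses.
accuracy_when_attempted = counts["correct"] / (counts["correct"] + counts["incorrect"])
abstention_rate = counts["not_attempted"] / len(grades)
```

With these two numbers side by side, a model that answers rarely but accurately looks very different from one that answers everything and is often wrong, even if their raw accuracy is identical.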

HalluLens, presented at ACL 2025, takes a broader approach. It includes multiple task types (short-form QA, long-form generation, and nonexistent entity detection) and explicitly measures both hallucination rates and false refusal rates, the cases where a model declines to answer something it actually knows. This dual measurement is important because it captures a tradeoff that SimpleQA alone misses.

A model that refuses everything would score perfectly on hallucination metrics but be useless in practice. HalluLens found substantial variation across models, with GPT-4o rarely refusing (4.13% false refusal rate) while Llama-3.1-8B-Instruct refused over 83% of the time. Neither extreme is desirable, and having both numbers visible forces a more honest conversation about what good behaviour looks like.

The most ambitious attempt to embed the OpenAI paper’s recommendations into a practical benchmark may be AA-Omniscience, published by Artificial Analysis in November 2025. Its central metric, the Omniscience Index, does exactly what the OpenAI paper prescribed. Correct answers earn +1 point, incorrect answers cost -1 point, and abstentions score zero. This means a model that guesses and gets it wrong is actively penalised relative to a model that admits it doesn’t know. The scale runs from -100 to 100, where zero means a model is correct as often as it is incorrect.
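The scoring rule is simple enough to state in a few lines. This sketch assumes a plain (correct − incorrect) / total normalisation scaled to ±100; Artificial Analysis's exact weighting may differ:

```python
def omniscience_index(correct: int, incorrect: int, abstained: int) -> float:
    """Omniscience-Index-style score: +1 per correct answer, -1 per
    incorrect answer, 0 per abstention, scaled to -100..100."""
    total = correct + incorrect + abstained
    return 100.0 * (correct - incorrect) / total

# A model that guesses on everything and is right 40% of the time
# scores worse than one that abstains whenever it is unsure.
guesser = omniscience_index(correct=40, incorrect=60, abstained=0)
careful = omniscience_index(correct=35, incorrect=5, abstained=60)
assert careful > guesser
```

Zero on this scale means the model is right exactly as often as it is wrong, which is why so few frontier models clearing zero is a striking result.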

The results are striking, and somewhat grim. Out of 36 evaluated frontier models, only three scored above zero on the Omniscience Index. Claude 4.1 Opus led with 4.8, followed by GPT-5.1 at 2.0 and Grok 4 at 0.85. Every other model was more likely to hallucinate than to give a correct answer when measured on this basis. Models that look excellent on traditional accuracy benchmarks, including Grok 4 and GPT-5 variants, turned out to have hallucination rates of 64% and 81% respectively when their guessing behaviour was properly penalised.

The most recent entry is HalluHard, published in early 2026, which tackles something the earlier benchmarks mostly ignore. It tests hallucination in multi-turn, open-ended dialogue rather than single-turn factual questions. The reason is that errors compound across turns, and an early hallucination can contaminate the context that the model draws on for subsequent responses, creating a cascading failure that single-turn benchmarks can’t detect. HalluHard found that hallucinations remain substantial even for frontier models with web search access, and that models become progressively more prone to fabrication as conversations grow longer.

One of HalluHard’s more interesting results involves the interaction between reasoning ability and abstention. While more effective reasoning generally reduces hallucination, the effect is model-dependent. GPT-5.2 with reasoning enabled abstains significantly more than its non-reasoning counterpart, especially on niche knowledge questions, suggesting that deeper thinking makes the model more aware of its own knowledge boundaries. But this pattern doesn’t hold universally, and some models show the opposite behaviour, where reasoning makes them more confident rather than more cautious.

The benchmark also confirmed something the OpenAI paper predicted, that models struggle most with niche facts that have some trace in training data rather than with completely fabricated entities. When asked about something entirely made up, models are more likely to recognise it as unfamiliar and refuse to answer. But when asked about something they vaguely recognise without knowing well, they tend to guess, because the partial familiarity triggers the “known entity” response that Anthropic’s circuit analysis identified.

Work at the training level points in a more encouraging direction. A December 2025 paper on behaviourally calibrated reinforcement learning showed that a 4-billion-parameter model trained with proper calibration incentives could match or exceed frontier models on uncertainty quantification, despite being orders of magnitude smaller. The model’s signal-to-noise ratio gain (measuring the ratio of correct answers to hallucinations) substantially beat GPT-5 on challenging mathematical reasoning tasks, suggesting that teaching models when to abstain is a skill that can be learned independently of raw knowledge.

Where Evaluation Still Falls Short

Despite this progress, the structural problems the OpenAI paper identified remain largely intact. There are at least four ways in which the current evaluation system continues to fail.

The leaderboard problem persists. The benchmarks that drive public perception, model selection, and commercial decisions are still overwhelmingly accuracy-only. When a new model launches, the numbers that appear in the announcement blog post are accuracy on MMLU, pass rates on SWE-bench, scores on GPQA Diamond. These are the metrics that journalists report, that enterprise buyers compare, and that engineering teams optimise for. Benchmarks like AA-Omniscience and HalluLens exist but remain niche, and until the headline number on a model card includes a hallucination-penalising metric alongside accuracy, the incentive structure the OpenAI paper described will continue to push models toward confident guessing.

Single-turn factuality is an inadequate proxy for production behaviour. Most hallucination benchmarks test whether a model can correctly answer isolated factual questions. But the failure modes that actually hurt people in deployment are different. They involve subtle distortions in summaries, fabricated citations in legal research, invented details woven into otherwise accurate reports, and cascading errors in multi-turn conversations. HalluHard is a step toward tackling this, but it remains a single benchmark. The gap between “can this model answer trivia correctly” and “will this model produce reliable output in my specific workflow” is enormous, and very few evaluations attempt to bridge it.

Domain-specific hallucination is underexplored. AA-Omniscience shows dramatic variation across domains, with different models leading in different domains. A Stanford study in the Journal of Empirical Legal Studies found that even purpose-built legal AI tools like Westlaw AI produce responses that are not significantly more trustworthy than general-purpose models, with hallucinations that require close analysis of cited sources to detect.

A study in npj Digital Medicine found that GPT-4o hallucinated at a 53% rate on medical questions before targeted mitigation, dropping to 23% with improved prompting. These domain-specific rates are far higher than the aggregated numbers that appear on general leaderboards, and they vary in ways that general-purpose benchmarks don’t capture.

Retrieval-augmented generation doesn’t solve the problem. There’s a widespread assumption that giving models access to external documents through RAG architectures eliminates hallucination risk. The evidence doesn’t support this. Vectara’s hallucination leaderboard, which tests grounded summarisation where models are given source documents and asked to faithfully summarise them, still shows non-trivial inconsistency rates across all models tested.

The model can misread the source, over-generalise from it, or fill gaps between retrieved passages with invented material. RAG reduces the frequency of hallucination, but it changes the type rather than eliminating the problem. And because RAG-augmented models often cite their sources, the hallucinations they do produce carry an extra layer of false authority that makes them harder to catch.

The entire evaluation terrain is English-only and text-only. Nearly every benchmark discussed so far tests English-language factual questions in a text-to-text setting. This is a problem because hallucination rates spike dramatically once you step outside that narrow frame. Mu-SHROOM, a SemEval 2025 shared task that tested hallucination detection across 14 languages, found that hallucination rates and detection difficulty vary enormously by language, with low-resource languages showing far worse outcomes than English. The task attracted 2,618 submissions from 43 teams, a sign of the community’s recognition of this gap, and the results confirmed what many suspected. A model that is well-calibrated in English can be wildly overconfident in Swahili or Basque.

The multimodal picture is no better. CCHall, presented at ACL 2025, tests hallucination when models must reason across both languages and images simultaneously. Even the best-performing model (GPT-4o with a multi-agent debate framework) achieved only 77.5% accuracy, with performance dropping 10.9 points compared to handling cross-modal hallucinations alone.

The benchmark also found that longer model responses trigger substantially higher hallucination rates, with a sharp inflection point around 120 words, after which output reliability degrades significantly. These are not obscure failure modes. If you’re deploying a model to handle customer queries in multiple languages, or building a system that reasons over images and text together, your real-world hallucination rate is almost certainly higher than what any English-only benchmark would predict.

Enterprise evaluation is moving in the right direction but slowly. The Bessemer State of AI 2025 report noted that 2025 and 2026 would mark a turning point where AI evaluations go “private, grounded, and trusted,” with enterprises building domain-specific evaluation frameworks tailored to their own data and risk profiles.

This is encouraging, but it is a shift toward bespoke testing that doesn’t feed back into the public benchmarks that shape model development. If enterprises build better evals internally but the public leaderboards remain accuracy-only, the models themselves will continue to be optimised for the wrong thing. The fix needs to happen upstream, in the benchmarks that model developers train against, rather than downstream in the evaluations that buyers run after deployment.

The External Pressure Nobody Planned For

The discussion so far has framed hallucination as an internal industry problem, something the AI field needs to solve through better benchmarks and training practices. But the pressure to fix it is increasingly coming from outside the field entirely.

In June 2023, a New York federal judge sanctioned two lawyers and fined them $5,000 for submitting a brief containing fabricated case citations generated by ChatGPT. The Mata v. Avianca case became the first widely reported instance of AI hallucinations entering the legal system, and it set off a chain reaction. One of the lawyers testified that he was “operating under the false perception that [ChatGPT] could not possibly be fabricating cases on its own.” By mid-2025, courts across the country had moved well beyond fines.

In Johnson v. Dunn (July 2025), a Northern District of Alabama judge declared that monetary sanctions were proving ineffective at deterring AI-generated errors and instead disqualified the offending attorneys from the case entirely. Multiple courts now require attorneys to certify that AI-assisted filings have been manually verified.

The problem extends well beyond law firms, and in January 2026, GPTZero scanned all 4,841 papers accepted by NeurIPS 2025, the world’s most prestigious machine learning conference, and found over 100 confirmed hallucinated citations spread across 51 papers. These included fabricated authors, invented paper titles, and fake DOIs, all of which survived review by three or more expert peer reviewers.

Some were obvious (author names like “John Doe and Jane Smith”), but others were sophisticated blends of real papers with modified titles and expanded author initials. The irony is hard to miss. The leading AI researchers in the world were fooled by the exact failure mode their field is supposed to be studying.

GPTZero had previously found 50 hallucinated citations in papers under review at ICLR 2026, and a separate analysis found that fabricated citations had appeared in US government reports requiring corrections, and in consulting outputs that triggered $98,000 (AUD) refunds.

The pattern is consistent. Hallucinated content doesn’t stop at degrading individual conversations. It enters the official record, whether that’s case law, academic literature, or policy documents, and from there it compounds. Those NeurIPS papers with fake citations will themselves become training data for next-generation models, creating what one researcher called a “self-reinforcing hallucination loop.”

These consequences are materialising faster than the evaluation frameworks are improving. Courts, publishers, and regulators aren’t waiting for the AI field to solve its benchmark problems. They’re imposing external accountability in the form of sanctions and regulatory mandates.

This may end up being the most effective forcing function for better hallucination measurement, not because the field decided to measure the right things, but because the cost of measuring the wrong things became impossible to ignore.

The Collective Action Problem

The deepest issue the OpenAI paper surfaces is structural rather than technical. No individual lab has a strong incentive to score worse on existing benchmarks by making their model more cautious, even if they agree that the benchmarks are measuring the wrong thing. If Lab A trains its model to say “I don’t know” more often and Lab B doesn’t, Lab B’s model will look better on the accuracy-only leaderboards that dominate public comparison. Lab A’s model might be more reliable in practice, but that advantage is invisible to the metrics that drive adoption.

This is a textbook coordination problem. Everyone would benefit from better benchmarks, but nobody wants to be the first to optimise for them at the expense of looking worse on the old ones. The OpenAI paper acknowledges this by framing the solution as “socio-technical,” requiring both a better evaluation and broad adoption of it across the field.

There are signs of movement, though. An August 2025 joint safety evaluation by OpenAI and Anthropic showed the two leading labs converging on “Safe Completions” training that incorporates calibrated uncertainty into model behaviour. Artificial Analysis has folded the Omniscience Index into its Intelligence Index alongside traditional metrics. And newer benchmarks like HalluLens and HalluHard are gaining citations and attention in the research community.

But these are early moves. The central question, whether the field can shift from treating accuracy as the headline metric to treating reliability (accuracy minus hallucination, weighted by abstention) as the headline metric, remains open. Until that shift happens at the level of public leaderboards and model marketing, the incentive structure that produces hallucination will persist even as the models themselves become more capable of avoiding it.

What This Means in Practice

If you’re building with language models today, the practical takeaway from all of this is that you can’t trust aggregate benchmark numbers to tell you how a model will behave in your specific use case. A model that scores 90% on a general factuality benchmark might hallucinate at 50%+ rates in your domain, and you won’t know until you test it on your own data with evaluation criteria that penalise fabrication.

The research points toward a few concrete steps that are worth spelling out. First, when evaluating models for knowledge-intensive tasks, look at metrics that separate accuracy from hallucination rate and include abstention behaviour. The Omniscience Index and SimpleQA’s three-way grading (correct, incorrect, not attempted) provide better signals than raw accuracy alone.

Second, don’t assume that RAG eliminates the problem, and test your retrieval system with adversarial queries and check whether the model fabricates answers when retrieved context is incomplete or ambiguous.

Third, consider domain-specific evaluation, because a model that does well at coding benchmarks may struggle with legal or medical factuality, and general leaderboards won’t tell you that.

Fourth, pay attention to how a model behaves under uncertainty. If it never says “I don’t know” in your testing, that’s a red flag rather than a strength. The AA-Omniscience results showed that models with the highest accuracy often had the worst reliability scores, precisely because they never abstained.

It’s also worth noting that the gap between public benchmarks and production behaviour creates an information asymmetry that benefits model providers at the expense of buyers. A model card that reports 95% accuracy on a factuality benchmark sounds impressive until you learn that the same model hallucinates 60%+ of the time when it encounters questions outside its confident knowledge range. The metrics that count for your use case, things like “how often does this model fabricate a citation” or “what percentage of its medical advice is unsupported by evidence,” are almost never reported in public evaluations. Building your own eval suite, however tedious, remains the only reliable way to understand what a model will actually do with your data.

The OpenAI paper ends with a note that bears repeating. Even a perfectly calibrated model will still produce some hallucinations, because some questions are genuinely unanswerable from any finite training set. The goal isn’t zero hallucinations. It’s a system that knows what it knows, admits what it doesn’t, and is evaluated by metrics that reward exactly that behaviour. We’re not there yet, and the gap between where we are and where we need to be is not mainly a gap in model ability. It’s a gap in how we measure and reward model behaviour. The models are increasingly capable of being honest about their uncertainty. The question is whether we’ll let them.

 
Read more... Discuss...

from EpicMind

Illustration of an ancient philosopher in a toga, sitting exhausted at a modern office workstation in front of a computer, surrounded by empty office chairs and urban architecture.

Friends of wisdom! We are all ruled by the negativity bias. That is why criticism echoes far longer than praise. But we can push back against it.

A grumpy comment in a meeting stays in the mind longer than the spontaneous praise that morning. This tendency is no accident but the expression of a mechanism deeply anchored in us: the negativity bias. Our brain reacts more strongly to potential dangers than to positive stimuli, a trait that secured our survival over the course of evolution but today can increasingly become a burden.

Scientific studies show that negative impressions are processed more intensely in the brain and linger longer, sometimes for months. This heightened attention to the bad is useful when it comes to recognising risks or correcting things that are going wrong. But it can also end in chronic rumination, anxiety, or exhaustion if it is not consciously managed.

The good news: our brain is malleable. It can be trained to perceive the positive more strongly, not through sugar-coating but through deliberate attention. Anyone who regularly takes conscious note of small moments of gratitude, or who, when annoyed or frustrated, actively turns their attention to constructive courses of action, can offset the effect of the negativity bias. What matters is not blocking out the bad, but consciously supplementing it with the good.

The ability to process the negative is central to personal growth, provided we learn to use it without losing ourselves in it. Dealing consciously with this cognitive tendency can not only strengthen our psychological well-being but also make our actions clearer and more effective.

Food for thought to start the week

“The moralist's ancestral place is, and remains, the lost outpost.” – Erich Kästner (1899–1974)

ProductivityPorn tip of the week: Set clear goals

Vague goals like “I want to be more productive” won't get you anywhere. Set yourself clear, measurable goals with a deadline so you can work toward them in a focused way.

From the archive: Why work always expands (Parkinson's Law)

Sound familiar? You have a week for a project, and yet on the eve of the deadline you find yourself caught in a whirl of haste and stress. This phenomenon has a name: Parkinson's Law. It states that work always expands to fill the time available for it. In this post I explain what lies behind this phenomenon, who the Parkinson was who formulated the law, and how a few simple strategies can keep your work from dragging on unnecessarily.

read more …

Thank you for taking the time to read this newsletter. I hope its contents have inspired you and given you valuable impulses for your (digital) life. Stay curious and question what you encounter!


EpicMind – Wisdom for digital life. “EpicMind” (short for “Epicurean Mindset”) is my blog and newsletter devoted to learning, productivity, self-management, and technology, all seasoned with a pinch of philosophy.


Disclaimer: Parts of this text were revised with Deepl Write (proofreading and copy-editing). Google's NotebookLM was used for research in the works/sources mentioned and in my notes. The article image was created with ChatGPT and then post-processed.

Topic #Newsletter

 
Read more... Discuss...

from Wayfarer's Quill

There are moments in a wanderer’s life when the road opens unexpectedly, revealing not a new landscape but a deeper layer of the old one. I found myself in such a moment while listening to a quiet reflection from Bishop Robert Barron, spoken in one of his Sunday sermons. His words lingered like a lantern held up to the long corridors of history.

He spoke of Christ not simply as a figure within time, but as the fulcrum upon which time itself turns. We mark our calendars with the quiet acknowledgment of this: B.C., before Christ, and A.D., anno Domini, “in the year of the Lord.” These are not poetic inventions or theological embellishments. They are the way humanity chose to measure its days. The world, knowingly or not, set its clocks by His arrival.

It is a curious thing. If Jesus had been a mere wanderer, a forgotten teacher, or a passing voice among many, the centuries would not have bent around His birth. Time does not rearrange itself for a fraud. Civilizations do not reset their calendars for a nobody. Something happened—something so luminous, so disruptive, so unlike anything before or after—that the human story split in two.

And long before that moment, the prophets whispered of a figure who would come. In the book of Jeremiah, there is a promise spoken into a weary world:

“The days are coming… when I will fulfill the promise I made… In those days Judah shall be saved and Jerusalem shall dwell secure.” —Jeremiah 33:14–16

Bishop Barron noted that Jesus is unique among religious leaders in this way: He was foretold. His coming was not a surprise but a long-awaited dawn. The ancient world leaned forward toward Him, as though creation itself were holding its breath.

As I walked with these thoughts, I felt again that quiet tug—the sense that history is not a flat line but a story with a center. And at that center stands a man who was more than a man, a presence strong enough to steady the axis of time.

For a traveler of quiet roads, it is humbling to remember that even our wandering takes place in the years of the Lord.

#Reflections #ChristInHistory

 
Read more... Discuss...

from sugarrush-77

Today the sermon was great, but during my cell group meeting afterwards I was immediately sucked into an insipid conversation that lasted an hour and a half. I rolled out of bed finding it difficult to care about anything or anyone, so there's that, but also some people are really boring. No offense to them, because I'm sure there's someone out there who finds them interesting, but I find them really boring. And two of those people happened to be locked in intense discussion over the most inconsequential, surface-level topic of working visas right in front of me, in a situation where I could not get up and leave. I was bored to tears, and annoyed that my afternoon had been wasted in such a way. Next time, I'm saying that I need to meet a friend, and I'm getting up. The last 30-40 minutes of substantial conversation we had at the end did not make up for it in any way, shape, or form. Could've done without it. Why do we have these again?

I’m in a state of intense despair because I’m pretty sure I have to see these people for the next 6 months to a year. Gonna be like stuffing a sandpaper rod up my asshole.

Sermon was great though. Today I found it difficult to concentrate, but I still got most of it. It jumped through a couple topics kinda like this.

  1. Ask not what God can do for you, but what you can do for God

  2. Living as a witness of Jesus’s death and His coming back to life

  3. Living as a witness part two: you must spread the Good News

Ask not what God can do for you, but what you can do for God

This one pretty much stands on its own, and I spaced out for ten minutes daydreaming of some random bullshit, I bet, because I don’t even remember what I dreamed about.

Living as a witness of Jesus’s death and His coming back to life

In modern Christianity, especially in Korean circles, there's this made-up bullshit of people talking about giving glory to God through worldly success. We made that up; that kind of statement does NOT exist in the Bible, and the first Christians definitely did not subscribe to it.

The material conditions of the first Christians’ lives did not change remarkably after their conversion to Christ, except when they were carried off to be fed to lions for sport, or killed in various other situations for what they believed in. The change was purely internal, and their behavioral changes were from within. The slaves were still slaves, the working class remained working class. It seems that God rarely rewarded them materially for their obedience, and despite that, they gave their lives for Him, and used their lives to serve others.

This goes against the grain of how societies in developed nations work today – individualism is at a record high, and the concept of serving others in love has long since been forgotten. Yet God's call still remains, and we have forerunners in the faith to remind us of what we should all strive to be like. And the important thing to remember is not how great the apostles were, but to see instead the God who changed their hearts and transformed them.

Living as a witness part two: you must spread the Good News

  • The Good News is not something you spread only when you are ready to spread it, when you've properly prepared your heart, when your life is no longer a mess, and when you've finally overcome the sins you've been struggling with all your life. If that's your standard, you'll never be ready anyway.
  • Spreading the Good News is like spreading breaking news. It doesn’t matter what’s going on in your life right now, you’ve gotta spread it. As long as you’re confident, and you spread it with conviction, you’ve done it right.
  • If your heart is overflowing with joy about Jesus and the Good News, you’ll want to spread it anyways. And if you aren’t preoccupied with this matter, you’ll be preoccupied with other matters of the flesh. And to remain in flesh is to remain in sin, yada yada yada, you know the spiel.
 
Read more...

from Nerd for Hire

I spent last weekend in Baltimore at my favorite yearly writer party, the annual AWP Conference. I'm not sure if it's just because I took a year off from it in 2025, but this year's at least seemed like the biggest and most active iteration of the conference I've been to post-Covid. The bookfair especially seemed larger than in past years. I wandered through it all three days of the convention and I'm still not entirely convinced I saw all of the tables.

This weekend I've finally had some time to sit down and go through all of the info, books, and swag I picked up from my bookfair tours. I found a surprising number of intriguing new-to-me publishers and organizations this year. I say “surprising” only because I've been to an absurd number of AWPs by this point (Baltimore was my 12th, if I'm doing my math right), and I spend a decent amount of time researching and reading literary journals between cons, too. But that's one of the beautiful things about literary publishing: it's always changing, and there's always something new to discover, no matter how long you spend immersed in the world. 

In any case, here are some literary magazines and other neat things that I'm very glad I know about now. 

The Enthusiast Press

I'm a sucker for a well-made hand-bound book, so I was predictably enamored by The Enthusiast Press. All of their books are hand-bound, unique, and gorgeous. They publish chapbook-length poetry and fiction manuscripts that fall generally under the umbrella of “dark-leaning literary.” They're open for submissions year-round, and you can find information on how to send them work on their About page.

Scrawl Place

Something else I'm a sucker for is unique, human-centered travel writing. There are a few great magazines publishing this travelogue-meets-personal-essay kind of stuff, and based on what I've read from them so far, I'll be adding Scrawl Place to that list. All the work they publish is connected to a place, but that includes poems and stories alongside essays.

Issues are free to read online and I definitely encourage folks to give them a read. They're also open year-round for general submissions, and currently have a specific call out for work about Chicago (through July 31st).

Scryptid Games

My top panel for the conference was the one I went to on writing for tabletop games, which was led by the crew from Scryptid Games. When I went by their table, I also saw they have a submission call out for Tales from the Cryptids, where they'll publish games, flash fiction, and poetry that tell stories from a cryptid's point-of-view. The call is open through April 30th, for anyone else who's got a story in that category to share. 

Scryptid also publishes some very fun-sounding story-based TTRPGs (Psychic Trash Detectives in particular caught my eye, and is one I might be buying for the group to play in the near future). For anyone else who's been considering making their own games, they have a couple of workshops coming up, including one on Zoom at the end of March. 

Hellbender Magazine

I feel especially well tuned-in to the literary scene of Northern Appalachia, so the fact that I had to travel to Baltimore to find out about Hellbender Magazine I consider to be something of a personal failure. This lovely little literary journal is based in Morgantown, WV, where it's run by graduate students from WVU. It's a revival of the university's previous literary journal, Cheat River Review, and in its new iteration relaunched in the fall of 2023.

Hellbender Magazine publishes flash prose (up to 1,500 words), poetry, and art. They're not open at the moment, but I'll be keeping an eye out for their next call because I enjoyed what I saw and read from them.

Books Not Bans

The mission of Books Not Bans is to send free boxes of banned and queer books to people who might otherwise not be able to access them. They work with schools, youth groups, bookmobiles, and other organizations across the United States, largely in rural areas, and have already sent out over 2,100 books in their first year and a half, which is pretty awesome. 

I chatted with the founder for a while in the bookfair and she's super enthusiastic about the mission of making sure everybody has access to quality, diverse literature, no matter where they live. Anyone who's also into that and wants to volunteer can sign up on their website (or any organizations that want to get books can find a form in the FAQ).

Weird Lit Magazine

Anything that has the word “weird” in the title is going to instantly have my attention. Then I saw that Weird Lit Magazine's logo is a sea monster coming out of a planet, and I felt like I'd found my people. They're a quarterly based in the Pacific Northwest and publish online. They just published their 7th issue, so they're still fairly new. You couldn't tell by reading the issues, though. They're well-designed and fun to read, especially if you enjoy stuff in the slipstream or absurdist category.

Weird Lit Magazine isn't open at the moment but they'll be opening up on April 15th. When they do, they'll consider fiction up to 3,000 words. They have fairly detailed info on the kind of stuff they're looking for on their submission guidelines.

Silly Goose Press

As an often highly un-serious person myself, I appreciate other literary projects that don't take themselves too seriously. That's the instant vibe I get from Silly Goose Press. Their mission is to publish “craft-forward whimsy”, and you can read their online issues to get a sense for what they mean by that. They started in 2024, so they're still fairly new, but they've published an impressive number of issues given their short history.

Silly Goose Press is currently open for submissions through the end of March. They publish poetry, art, and fiction or creative nonfiction up to 3,000 words. Something I love about their submissions page: they link to other resources for submitters right there, including info on cover letters and a link to ChillSubs. They also have a sample version of their contract available to view, which is a huge green flag for me as an author that the editors have their shit together.

Cola Literary Review

Cola is a new-ish journal from an institution with experience in the lit mag world. It's run by the University of South Carolina's MFA program, which previously ran the literary journal and chapbook press Yemassee from 1993 until it rebranded in 2022.

I was a few years behind on this rebranding, obviously, but I will say it seems to be a deeper alteration than just a new name. The design of Cola is more modern than I remember Yemassee's being, and they seem like a good home for character-driven literary fiction, based on the recent pieces they have available to read on their website. They're not open for submissions currently but will have a free reading period in September for their next print issue. 

A Reason to Write Retreats and Workshops

I haven't taken a writing retreat in a minute, so I had my eye out for ones that looked interesting as I perused the bookfair. One reason this one stood out is because it's pretty much in my backyard, just down in Harpers Ferry, WV. I've also seen enough of West Virginia to know it's friggin beautiful and would make a wonderful place to get some writing done, so A Reason to Write is definitely on my radar of places to apply.

I also noticed poking around their website that they have some flash workshops coming up in the fall. They also offer 7-day fellowships, up to 5 of them every year, so if you want to take a retreat but the cost is an issue, that could be something to look into.  


This is obviously just a small sampling of the many cool things I saw in the AWP bookfair, but hopefully there's something in there that's a new and exciting find for other folks, too. I'm personally off to send out some submissions and hopefully keep the momentum from the conference rolling. 

See similar posts:

#Conferences #PublishingAdvice #Submissions

 
Read more...
