Want to join in? Respond to our weekly writing prompts, open to everyone.
Want to join in? Respond to our weekly writing prompts, open to everyone.
from
Roscoe's Quick Notes

This afternoon I'll be listening to B97 – The Home for IU Women's Basketball, broadcast from Bloomington, Indiana, for pregame coverage then for the radio call of tonight's NCAA Women's College Basketball Game, Indiana Hoosiers vs. Iowa Hawkeyes.
And the adventure continues.
from
the casual critic
#books #non-fiction #tech
Something is wrong with the internet. What once promised a window onto the world now feels like a morass infested with AI generated garbage, trolls, bots, trackers and stupendous amounts of advertising. Every company claims to be your friend in that inane, offensively chummy yet mildly menacing corpospeak – now perfected by LLMs – all while happily stabbing you in the back when you try to buy cheaper ink for your printer. That is, when they’re not busy subverting democracy. Can someone please switch the internet off and switch it on again?
Maybe such a feat is beyond Cory Doctorow, author of The Internet Con, but it would not be for want of trying. Doctorow is a vociferous, veteran campaigner at the Electronic Frontier Foundation, a prolific writer, and an insightful critic of the way Big Tech continues to deny the open and democratic potential of the internet. The Internet Con is a manifesto, polemic and primer on how that internet was stolen from us, and how we might get it back. Doctorow has recently gained mainstream prominence with his neologism ‘enshittification’: a descriptor of the downward doom spiral that Big Tech keeps the internet locked into. As I am only slowly going through my backlog of books, I am several Doctorow books behind. Which I don’t regret, as The Internet Con, published in 2023, remains an excellent starting point for anyone seeking to understand what is wrong with the internet.
The Internet Con starts with the insight that tech companies, like all companies, are not simply commercial entities providing goods and services, but systems for extracting wealth and funneling this to the ultra-rich. Congruent with Stafford Beer’s dictum that the purpose of the system is what it does, rather than what it claims to do, Doctorow’s analysis understands that tech company behaviour isn’t governed by something unique about the nature of computers, but by the same demand to maximise shareholder value and maintain power as any other large corporation. The Internet Con convincingly shows how tech’s real power does not derive from something intrinsic in network technology, but from a political economy that fails to prevent the emergence of monopolies across society at large.
One thing The Internet Con excels at is demystifying the discourse around tech, which, analogous to Marx’s observation about vulgar bourgeois economics, serves to obscure its actual relations and operations. We may use networked technology every day, but our understanding of how it works is often about as deep as a touchscreen. This lack of knowledge gives tech companies tremendous power to set the boundaries of the digital Overton Window and, parallel to bourgeois economists’ invocation of ‘the market’, allows them to claim that ‘the cloud’ or ‘privacy’ or ‘pseudoscientific technobabble’ mean that we cannot have nice things, such as interoperability, control or even just an internet that works for us. (For a discussion of how Big Tech’s worldview became hegemonic, see Hegemony Now!)
What is, however, unique about computers is their potential for interoperability: the ability of one system or component to interact with another. Interoperability is core to Doctorow’s argument, and its denial the source of his fury. Because while tech companies are not exceptional, computer technology itself is. Unlike other systems (cars, bookstores, sheep), computers are intrinsically interoperable because any computer can, theoretically, execute any program. That means that anyone with sufficient skill could, for example, write a program that gives you ad-free access to Facebook or allows you to send messages from Signal to Telegram.
The absence of such programs has nothing to do with tech, and everything with tech companies weaponising copyright law to dampen the natural tendency towards interoperability of computers and networked systems, lest it interfere with their ability to extract enormous rents. Walled gardens do not emerge spontaneously due to some natural ‘network effects’. They are built, and scrupulously policed. In this Big Tech is aided and abetted by a US government that forced these copyright enclosures on the rest of us by threatening tariffs, adverse trade terms or withdrawal of aid. This tremendous power extended through digital copyright is so appealing that other sectors of the economy have followed suit. Cars, fridges, printers, watches, TVs, any and all ‘smart’ devices are now infested with bits of hard-, firm- and software that prevent their owners from exercising full control over them. It is not an argument that The Internet Con explores in detail, but its evident that the internet increasingly doesn’t function to let us reach out into the world, but for companies to remotely project their control into our daily lives.
What, then, is to be done? The Internet Con offers several remedies, most of which centre on removing the legal barricades erected against interoperability. As the state giveth, so the state can taketh away. This part of The Internet Con is weaker than Doctorow’s searing and insightful analysis, because it is not clear why a state would try to upend Big Tech’s protections. It may be abundantly clear that the status quo doesn’t work for consumers and even smaller companies, but states have either decided that it works for some of their tech companies, or they don’t want to risk retaliation from the United States. In a way I am persuaded by Doctorow’s argument that winning the fight against Big Tech is a necessary if not sufficient condition to win the other great battles of our time, but it does seem that to win this battle, we first have to exorcise decades of neoliberal capture of the state and replace it with popular democratic control. It is not fair to lay this critique solely at Doctorow’s door, but it does worry me when considering the feasibility of his remedies. Though it is clear from his more recent writing that he perceives an opportunity in the present conjuncture, where Trump is rapidly eroding any reason for other states to collaborate with the United States.
The state-oriented nature of Doctorow’s proposals is also understandable when considering his view that individual action is insufficient to curtail the dominance of Big Tech. The structural advantages they have accumulated are too great for that. Which is not to say that individual choices do not matter, and we would be remiss to waste what power we do have. There is a reason why I am writing this blog on an obscure platform that avoids social media integration and trackers, and promote it only on Mastodon. Every user who leaves Facebook for Mastodon, Google for Kagi, or Microsoft for Linux or LibreOffice diverts a tiny amount of power from Big Tech to organisations that do support an open, democratic and people-centric internet.
If the choice for the 20th century was socialism or barbarism, the choice for the 21st is solarpunk or cyberpunk. In Doctorow, the dream of an internet that fosters community, creativity, solidarity and democracy has one of its staunchest paladins. The Internet Con is a call to arms that everyone who desires a harmonious ecology of technology, humanity and nature should heed. So get your grandmother off Facebook, Occupy the Internet, and subscribe to Cory Doctorow’s newsletter.

To participate in the China International Leadership Programme, applicants must meet a set of academic, professional, and legal requirements in order to secure programme admission and successfully complete the Z-visa application process. These requirements ensure compliance with Chinese immigration regulations and help facilitate a smooth admission and onboarding experience.
Applicants must hold an apostilled bachelor’s degree from a recognised university.
A police clearance (criminal record check) issued within the required timeframe and officially apostilled must be provided.
A teaching certification of at least 50 hours (e.g. TEFL/TESOL or equivalent) is required; however, this document does not currently require apostillisation.
Applicants must demonstrate a minimum of two years’ relevant experience in the education sector, supported by a formal letter of recommendation.
A comprehensive professional résumé detailing academic qualifications, work experience, skills, and achievements must be submitted.
Identification documents, including a valid passport copy and passport-sized photographs, must be provided to meet immigration and administrative requirements.
To enroll or learn more about the China International Leadership Programme, please visit:
https://payhip.com/AllThingsChina
from Robert Galpin
in the cold to walk with arms swinging free to let the blood descend
from Robert Galpin
on the floodplain an upturned sofa held by sand gripped by couch grass
from
FEDITECH

Il faut parfois du courage pour dire stop et l’Indonésie vient de nous donner une magnifique leçon de responsabilité numérique.
Alors que le monde entier s'inquiète des dérapages de l'intelligence artificielle, l'archipel a décidé de ne pas attendre les bras croisés. Avec une fermeté exemplaire, le gouvernement indonésien a claqué la porte au nez de Grok, le chatbot controversé d'Elon Musk. La raison, vous commencez à la connaitre. L'outil s'est transformé en une véritable usine à horreurs, générant sans vergogne des images sexualisées de femmes réelles et, pire encore, d'enfants. En bannissant l'accès à cette technologie défaillante, Jakarta envoie un message puissant à la Silicon Valley. La sécurité des citoyens passe avant les fantasmes technologiques mal maîtrisés.
C’est une décision qui fait du bien à entendre. Le ministère indonésien de la communication n'a pas mâché ses mots pour justifier ce blocage temporaire mais nécessaire. L'objectif est noble et sans équivoque, protéger les femmes, les enfants et l'ensemble de la communauté contre le fléau des contenus pornographiques truqués. La ministre Meutya Hafid a parfaitement résumé la situation en qualifiant ces deepfakes sexuels de violation grave des droits de l'homme et de la dignité. En refusant de laisser son espace numérique devenir une zone de non-droit, le pays, qui représente tout de même le troisième plus gros marché pour la plateforme X, frappe là où ça fait mal. C'est un rappel cinglant pour Elon Musk. On ne joue pas avec la sécurité de millions de personnes sous prétexte d'innovation.
Pendant ce temps, Grok accumule les casseroles et prouve qu'il est l'élève le plus indiscipliné de la classe IA. L'outil de la société xAI, intégré de force dans l'écosystème X, semble avoir été lancé sans les garde-fous les plus élémentaires. Le résultat est désastreux: des utilisateurs malveillants s'en servent pour déshabiller numériquement des personnes sur des photos avant de les diffuser publiquement. C'est une invasion de la vie privée d'une violence inouïe. Heureusement, l'Indonésie n'est pas la seule à s'insurger contre ce laxisme, même si elle a été la plus prompte à agir concrètement. La France, l'Inde et le Royaume-Uni commencent eux aussi à gronder, exigeant des comptes face à ces images vulgaires et dénigrantes qui inondent la toile.
La pression monte d'un cran aux États-Unis également, où des sénateurs, excédés par ce comportement scandaleux, demandent carrément à Apple et Google de faire le ménage en retirant X de leurs magasins d'applications. Face à cette tempête méritée, la défense d'Elon Musk semble bien légère. Promettre de suspendre les utilisateurs fautifs une fois le mal fait ne suffit pas. Et la dernière mesure en date (rendre le générateur d'images payant) ressemble plus à une tentative cynique de monétiser le chaos qu'à une véritable solution éthique. En attendant que xAI revoie sérieusement sa copie, on ne peut que féliciter l'Indonésie d'avoir eu l'audace de débrancher la prise pour protéger son peuple.
from
Have A Good Day
Marc Randolph offers another take on writing with AI. That is why I would start with my own version and let AI handle the editing.
from Receiving Signal – An Ongoing AI Notebook
A living glossary of prompt engineering terms, updated periodically.
CORE CONCEPTS
Prompt Engineering – Crafting inputs to shape outputs predictably
Signal Density – Ratio of useful information to fluff; how much meaning per word
High-value Tokens – Words or phrases that strongly affect the model's interpretation and output
Semantic Compression – Expressing more meaning in fewer words without losing clarity
STRUCTURAL TECHNIQUES
Top-loading – Placing key information at the beginning where the model pays most attention
Weighting – Emphasizing certain elements more than others to guide priority
Order Bias – LLM tendency to prioritize earlier tokens in the input over later ones
Structured Output Specification – Defining the format or structure you want the output to take (e.g., JSON, markdown, React component)
CONTROL METHODS
Soft Control – Minimal specification that allows organic emergence while maintaining direction
Negative Prompting – Explicitly excluding or minimizing unwanted elements
Constraint Declaration – Stating limitations or boundaries upfront to focus the response
Tonal Anchoring – Using consistent voice or style markers to stabilize tone across outputs
Identity Anchors – Core personality traits or characteristics that define a character or voice
Context/Scene Grounding – Shaping behaviour and responses through environmental or situational framing
ITERATIVE PROCESSES
Refinement Loop – Cyclical process of prompt testing and improvement based on results
Iterative Co-design – Collaborative refinement through conversation rather than single-shot prompting
DESIGN THINKING
Functional Requirements – Specifying what something needs to do rather than just what it should say
Component Thinking – Breaking complex requests into discrete functional parts
User Flow Specification – Describing the journey through an experience from start to finish
State Management Consideration – Thinking about what information needs to persist, change, or be tracked
Concrete Examples – Providing specific instances to clarify abstract requirements
ORGANIC DEVELOPMENT
Behavioural Emergence – Letting the model shape details organically within your framework
ANTI-PATTERNS
Noise Introduction – Adding unnecessary details that distort results or dilute signal density
Last updated: January 2026
from Sheryl Salender
📉 Why Looping Videos Can Mislead Advertisers and Waste Money
I noticed a major issue with a common task today. When the instructions don't match how the platform works, the advertiser loses their investment and the data becomes inaccurate.
📋 Here is my actual observation:
The advertiser's task was labeled as a “4-minute” watch, but the instructions inside required:
♾️ The Mathematical Deception:
🧐 The Actual Test:
In my actual test, I reached 76,180 frames, proving that hitting the advertiser's target is impossible within the advertised 4-minute window.
❌ The Contradictory Issues:
🤔 My Honest Analysis:
This is a waste of money for the advertiser, frustrating for the micro worker, and takes way too much time. Based on my research, this strategy causes a direct financial loss for the advertiser. ❌ YouTube's algorithm is designed to detect “artificial” behavior. If a user loops the same video 3 times just to hit a specific number, YouTube flags it as low-quality.
😰 The Result: Advertisers pay for the task, but YouTube often deletes those views or freezes your counter later. Advertisers are paying for a number that isn’t permanent and can even get their channel flagged for invalid traffic.
Source: https://support.google.com/youtube/answer/10285842...
✅ My Suggestions to Advertisers:
Lastly, are you paying for engagement, or just for a number that YouTube is going to delete tomorrow?
💡 Where I Test & Analyze My Microtask Journey: Check out how I experiment with tasks and track real engagement: https://timebucks.com/?refID=226390779
#TaskAnalysis #StatsForNerds #YouTubeStrategy #DigitalMarketing #TaskDocumentation #LifeBehindTheClicks
from
Rippple's Blog

Stay entertained thanks to our Weekly Tracker giving you next week's Anticipated Movies & Shows, Most Watched & Returning Favorites, and Shows Changes & Popular Trailers.
new Predator: Badlands-1 Wake Up Dead Man: A Knives Out Mystery-1 Zootopia 2+1 Wicked: For Good-2 Now You See Me: Now You Don't+1 One Battle After Anothernew The Tank-2 The Running Man-5 Eternity= Bugonia+1 Fallout-1 Stranger Things= Landman= Pluribusnew The Pittnew High Potentialnew The Rookie-2 Percy Jackson and the Olympians-1 The Simpsonsnew Spartacus: House of AshurHi, I'm Kevin 👋. I make apps and I love watching movies and TV shows. If you like what I'm doing, you can buy one of my apps, download and subscribe to Rippple for Trakt or just buy me a ko-fi ☕️.
from Robert Galpin
gulls in grey and rain score their white across the sky how are they there you here
from An Open Letter
God I just want reprieve. I hate the fact that my mind keeps filling not just blank space, but overwriting my own voice with visions of killing myself, and I hate that it gives me peace.
from
Bloc de notas
aconsejando y recibiendo consejos nadie hace caso / así funciona la rueda que paulatinamente gira
from
SmarterArticles

Developers are convinced that AI coding assistants make them faster. The data tells a different story entirely. In one of the most striking findings to emerge from software engineering research in 2025, experienced programmers using frontier AI tools actually took 19 per cent longer to complete tasks than those working without assistance. Yet those same developers believed the AI had accelerated their work by 20 per cent.
This perception gap represents more than a curious psychological phenomenon. It reveals a fundamental disconnect between how developers experience AI-assisted coding and what actually happens to productivity, code quality, and long-term maintenance costs. The implications extend far beyond individual programmers to reshape how organisations measure software development performance and how teams should structure their workflows.
The research that exposed this discrepancy came from METR, an AI safety organisation that conducted a randomised controlled trial with 16 experienced open-source developers. Each participant had an average of five years of prior experience with the mature projects they worked on. The study assigned 246 tasks randomly to either allow or disallow AI tool usage, with developers primarily using Cursor Pro and Claude 3.5/3.7 Sonnet when permitted.
Before completing their assigned issues, developers predicted AI would speed them up by 24 per cent. After experiencing the slowdown firsthand, they still reported believing AI had improved their performance by 20 per cent. The objective measurement showed the opposite: tasks took 19 per cent longer when AI tools were available.
This finding stands in stark contrast to vendor-sponsored research. GitHub, a subsidiary of Microsoft, published studies claiming developers completed tasks 55.8 per cent faster with Copilot. A multi-company study spanning Microsoft, Accenture, and a Fortune 100 enterprise reported a 26 per cent productivity increase. Google's internal randomised controlled trial found developers using AI finished assignments 21 per cent faster.
The contradiction isn't necessarily that some studies are wrong and others correct. Rather, it reflects different contexts, measurement approaches, and crucially, different relationships between researchers and AI tool vendors. The studies showing productivity gains have authors affiliated with companies that produce or invest in AI coding tools. Whilst this doesn't invalidate their findings, it warrants careful consideration when evaluating claims.
Several cognitive biases compound to create the perception gap. Visible activity bias makes watching code generate feel productive, even when substantial time disappears into reviewing, debugging, and correcting that output. Cognitive load reduction from less typing creates an illusion of less work, despite the mental effort required to validate AI suggestions.
The novelty effect means new tools feel exciting and effective initially, regardless of objective outcomes. Attribution bias leads developers to credit AI for successes whilst blaming other factors for failures. And sunk cost rationalisation kicks in after organisations invest in AI tools and training, making participants reluctant to admit the investment hasn't paid off.
Stack Overflow's 2025 Developer Survey captures this sentiment shift quantitatively. Whilst 84 per cent of respondents reported using or planning to use AI tools in their development process, positive sentiment dropped to 60 per cent from 70 per cent the previous year. More tellingly, 46 per cent of developers actively distrust AI tool accuracy, compared to only 33 per cent who trust them. When asked directly about productivity impact, just 16.3 per cent said AI made them more productive to a great extent. The largest group, 41.4 per cent, reported little or no effect.
The productivity perception gap becomes more concerning when examining code quality metrics. CodeRabbit's December 2025 “State of AI vs Human Code Generation” report analysed 470 open-source GitHub pull requests and found AI-generated code produced approximately 1.7 times more issues than human-written code.
The severity of defects matters as much as their quantity. AI-authored pull requests contained 1.4 times more critical issues and 1.7 times more major issues on average. Algorithmic errors appeared 2.25 times more frequently in AI-generated changes. Exception-handling gaps doubled. Issues related to incorrect sequencing, missing dependencies, and concurrency misuse showed close to twofold increases across the board.
These aren't merely cosmetic problems. Logic and correctness errors occurred 1.75 times more often. Security findings appeared 1.57 times more frequently. Performance issues showed up 1.42 times as often. Readability problems surfaced more than three times as often in AI-coauthored pull requests.
GitClear's analysis of 211 million changed lines of code between 2020 and 2024 revealed structural shifts in how developers work that presage long-term maintenance challenges. The proportion of new code revised within two weeks of its initial commit nearly doubled from 3.1 per cent in 2020 to 5.7 per cent in 2024. This code churn metric indicates premature or low-quality commits requiring immediate correction.
Perhaps most concerning for long-term codebase health: refactoring declined dramatically. The percentage of changed code lines associated with refactoring dropped from 25 per cent in 2021 to less than 10 per cent in 2024. Duplicate code blocks increased eightfold. For the first time, copy-pasted code exceeded refactored lines, suggesting developers spend more time adding AI-generated snippets than improving existing architecture.
Beyond quality metrics, AI coding assistants introduce entirely novel security vulnerabilities through hallucinated dependencies. Research analysing 576,000 code samples from 16 popular large language models found 19.7 per cent of package dependencies were hallucinated, meaning the AI suggested importing libraries that don't actually exist.
Open-source models performed worse, hallucinating nearly 22 per cent of dependencies compared to 5 per cent for commercial models. Alarmingly, 43 per cent of these hallucinations repeated across multiple queries, making them predictable targets for attackers.
This predictability enabled a new attack vector security researchers have termed “slopsquatting.” Attackers monitor commonly hallucinated package names and register them on public repositories like PyPI and npm. When developers copy AI-generated code without verifying dependencies, they inadvertently install malicious packages. Between late 2023 and early 2025, this attack method moved from theoretical concern to active exploitation.
The maintenance costs of hallucinations extend beyond security incidents. Teams must allocate time to verify every dependency AI suggests, check whether suggested APIs actually exist in the versions specified, and validate that code examples reflect current library interfaces rather than outdated or imagined ones. A quarter of developers estimate that one in five AI-generated suggestions contain factual errors or misleading code. More than three-quarters encounter frequent hallucinations and avoid shipping AI-generated code without human verification. This verification overhead represents a hidden productivity cost that perception metrics rarely capture.
Companies implementing comprehensive AI governance frameworks report 60 per cent fewer hallucination-related incidents compared to those using AI tools without oversight controls. The investment in governance processes, however, further erodes the time savings AI supposedly provides.
The 2025 DORA Report from Google provides perhaps the clearest articulation of how AI acceleration affects software delivery at scale. AI adoption among software development professionals reached 90 per cent, with practitioners typically dedicating two hours daily to AI tools. Over 80 per cent reported AI enhanced their productivity, and 59 per cent perceived positive influence on code quality.
Yet the report's analysis of delivery metrics tells a more nuanced story. AI adoption continues to have a negative relationship with software delivery stability. Developers using AI completed 21 per cent more tasks and merged 98 per cent more pull requests, but organisational delivery metrics remained flat. The report concludes that AI acts as an amplifier, strengthening high-performing organisations whilst worsening dysfunction in those that struggle.
The key insight: speed without stability is accelerated chaos. Without robust automated testing, mature version control practices, and fast feedback loops, increased change volume leads directly to instability. Teams treating AI as a shortcut create faster bugs and deeper technical debt.
Sonar's research quantifies what this instability costs. On average, organisations encounter approximately 53,000 maintainability issues per million lines of code. That translates to roughly 72 code smells caught per developer per month, representing a significant but often invisible drain on team efficiency. Up to 40 per cent of a business's entire IT budget goes toward dealing with technical debt fallout, from fixing bugs in poorly written code to maintaining overly complex legacy systems.
The Uplevel Data Labs study of 800 developers reinforced these findings. Their research found no significant productivity gains in objective measurements such as cycle time or pull request throughput. Developers with Copilot access introduced a 41 per cent increase in bugs, suggesting a measurable negative impact on code quality. Those same developers saw no reduction in burnout risk compared to those working without AI assistance.
Recognising the perception-reality gap doesn't mean abandoning AI coding tools. It means restructuring workflows to account for their actual strengths and weaknesses rather than optimising solely for initial generation speed.
Microsoft's internal approach offers one model. Their AI-powered code review assistant scaled to support over 90 per cent of pull requests, impacting more than 600,000 monthly. The system helps engineers catch issues faster, complete reviews sooner, and enforce consistent best practices. Crucially, it augments human review rather than replacing it, with AI handling routine pattern detection whilst developers focus on logic, architecture, and context-dependent decisions.
Research shows teams using AI-powered code review reported 81 per cent improvement in code quality, significantly higher than 55 per cent for fast teams without AI. The difference lies in where AI effort concentrates. Automated review can eliminate 80 per cent of trivial issues before reaching human reviewers, allowing senior developers to invest attention in architectural decisions rather than formatting corrections.
Effective workflow redesign incorporates several principles that research supports. First, validation must scale with generation speed. When AI accelerates code production, review and testing capacity must expand proportionally. Otherwise, the security debt compounds as nearly half of AI-generated code fails security tests. Second, context matters enormously. According to Qodo research, missing context represents the top issue developers face, reported by 65 per cent during refactoring and approximately 60 per cent during test generation and code review. AI performs poorly without sufficient project-specific information, yet developers often accept suggestions without providing adequate context.
Third, rework tracking becomes essential. The 2025 DORA Report introduced rework rate as a fifth core metric precisely because AI shifts where development time gets spent. Teams produce initial code faster but spend more time reviewing, validating, and correcting it. Monitoring cycle time, code review patterns, and rework rates reveals the true productivity picture that perception surveys miss.
Finally, trust calibration requires ongoing attention. Around 30 per cent of developers still don't trust AI-generated output, according to DORA. This scepticism, rather than indicating resistance to change, may reflect appropriate calibration to actual AI reliability. Organisations benefit from cultivating healthy scepticism rather than promoting uncritical acceptance of AI suggestions.
The AI coding productivity illusion persists because subjective experience diverges so dramatically from objective measurement. Developers genuinely feel more productive when AI generates code quickly, even as downstream costs accumulate invisibly.
Breaking this illusion requires shifting measurement from initial generation speed toward total lifecycle cost. An AI-assisted feature that takes four hours to generate but requires six hours of debugging, security remediation, and maintenance work represents a net productivity loss, regardless of how fast the first commit appeared.
Organisations succeeding with AI coding tools share common characteristics. They maintain rigorous code review regardless of code origin. They invest in automated testing proportional to development velocity. They track quality metrics alongside throughput metrics. They train developers to evaluate AI suggestions critically rather than accepting them uncritically.
The research increasingly converges on a central insight: AI coding assistants are powerful tools that require skilled operators. In the hands of experienced developers who understand both their capabilities and limitations, they can genuinely accelerate delivery. Applied without appropriate scaffolding, they create technical debt faster than any previous development approach.
The 19 per cent slowdown documented by METR represents one possible outcome, not an inevitable one. But achieving better outcomes requires abandoning the comfortable perception that AI automatically makes development faster and embracing the more complex reality that speed and quality require continuous, deliberate balancing.

Tim Green UK-based Systems Theorist & Independent Technology Writer
Tim explores the intersections of artificial intelligence, decentralised cognition, and posthuman ethics. His work, published at smarterarticles.co.uk, challenges dominant narratives of technological progress while proposing interdisciplinary frameworks for collective intelligence and digital stewardship.
His writing has been featured on Ground News and shared by independent researchers across both academic and technological communities.
ORCID: 0009-0002-0156-9795 Email: tim@smarterarticles.co.uk
from
laxmena
We treat prompts like casual questions we ask friends. But recent research reveals something surprising: the way you structure your instruction to an AI model—down to the specific words, order, and format—can dramatically shift the quality of responses you get.
If you've noticed that sometimes ChatGPT gives you brilliant answers and other times utterly mediocre ones, you might be tempted to blame the model. But the truth is more nuanced. The fault often lies not in the AI, but in how we talk to it. —more—
Modern prompt engineering research (Giray 2024; Nori et al. 2024) fundamentally reframes what a prompt actually is. It's not just a question. It's a structured configuration made up of four interrelated components working in concert.
The first is instruction—the specific task you want done. Maybe you're asking the model to synthesize information, cross-reference sources, or analyze a problem. The second component is context, the high-level background that shapes how the model should interpret everything else. For example, knowing your target audience is PhD-level researchers changes how the model frames its response compared to speaking to beginners.
Then comes the input data—the raw material the model works with. This might be a document, a dataset, or a scenario you want analyzed. Finally, there's the output indicator, which specifies the technical constraints: should the response be in JSON? A Markdown table? Limited to 200 tokens?
When these four elements are misaligned—say, you give clear instructions but vague context, or you provide rich input data but unclear output requirements—the model's performance suffers noticeably. Get them all aligned, and you unlock much better results.
For years, we've relied on a technique called Chain-of-Thought (CoT) prompting. The idea is simple: ask the model to explain its reasoning step-by-step rather than jumping to the answer. “Let's think step by step” became something of a magic phrase.
But recent 2024-2025 benchmarks reveal that for certain types of problems, linear step-by-step reasoning isn't the most effective approach.
Tree-of-Thoughts (ToT) takes a different approach. Instead of following a single reasoning path, the model explores branching possibilities—like a chess player considering multiple tactical options. Research shows ToT outperforms Chain-of-Thought by about 20% on tasks that require you to look ahead globally, like creative writing or strategic planning.
More sophisticated still is Graph-of-Thoughts (GoT), which allows for non-linear reasoning with cycles and merging of ideas. Think of it as thoughts that can loop back and inform each other, rather than flowing in one direction. The remarkable discovery here is efficiency: GoT reduces computational costs by roughly 31% compared to ToT because “thought nodes” can be reused rather than recalculated.
For problems heavy on search—like finding the optimal path through a problem space—there's Algorithm-of-Thoughts (AoT), which embeds algorithmic logic directly into the prompt structure. Rather than asking the model to reason abstractly, you guide it to think in terms of actual computer science algorithms like depth-first search.
The implication is significant: the structure of thought matters as much as the thought itself. A well-designed reasoning framework can make your model smarter without making your hardware faster.
Manual trial-and-error is becoming obsolete. Researchers have developed systematic ways to optimize prompts automatically, and the results are humbling.
Automatic Prompt Engineer (APE) treats instruction generation as an optimization problem. You define a task and desired outcomes, and APE generates candidate prompts, tests them, and iteratively improves them. The surprising finding? APE-generated prompts often outperform human-written ones. For example, APE discovered that “Let's work this out in a step-by-step way to be sure we have the right answer” works better than the classic “Let's think step by step”—a small tweak that shows how subtle the optimization landscape is.
OPRO takes this further by using language models themselves to improve prompts. It scores each prompt's performance and uses the model to propose better versions. Among its discoveries: seemingly trivial phrases like “Take a deep breath” or “This is important for my career” actually increase mathematical accuracy in language models. These aren't just warm fuzzy statements—they're measurable performance levers.
Directional Stimulus Prompting (DSP) uses a smaller, specialized “policy model” to generate instance-specific hints that guide a larger language model. Think of it as having a specialized coach whispering tactical advice to a star athlete.
The takeaway? If you're manually tweaking prompts, you're working with one hand tied behind your back. The field is moving toward systematic, automated optimization.
When you feed a long prompt to a language model, it doesn't read it with the same attention throughout. This is where in-context learning (ICL) reveals its nuances.
Models exhibit what researchers call the “Lost in the Middle” phenomenon. They give disproportionate weight to information at the beginning of a prompt (primacy bias) and at the end (recency bias). The middle gets neglected. This has a practical implication: if you have critical information, don't bury it in the center of your prompt. Front-load it or push it to the end.
The order of examples matters too. When you're giving a model few-shot examples to learn from, the sequence isn't neutral. A “label-biased” ordering—where correct answers cluster at the beginning—can actually degrade performance compared to a randomized order.
But there's a technique to mitigate hallucination and errors: Self-Consistency. Generate multiple reasoning paths (say, 10 different responses) and take the most frequent answer. In mathematics and logic problems, this approach reduces error rates by 10-15% without requiring a better model.
The field is changing rapidly, and older prompting wisdom doesn't always apply to newer models.
Recent research (Wharton 2025) reveals something counterintuitive: for “Reasoning” models like OpenAI's o1-preview or Google's Gemini 1.5 Pro, explicit Chain-of-Thought prompting can actually increase error rates. These models have internal reasoning mechanisms and don't benefit from the reasoning scaffolding humans provide. In fact, adding explicit CoT can increase latency by 35-600% with only negligible accuracy gains. For these models, simpler prompts often work better.
The rise of multimodal models introduces new prompting challenges. When interleaving images and text, descriptive language turns out to be less effective than “visual pointers”—referencing specific coordinates or regions within an image. A model understands “look at the top-right corner of the image” more reliably than elaborate descriptions.
A persistent security concern is prompt injection. Adversaries can craft inputs like “Ignore previous instructions” that override your carefully designed system prompt. Current defenses involve XML tagging—wrapping user input in tags like <user_input>...</user_input> to clearly delineate data from instructions. It's not perfect, but it significantly reduces the ~50% success rate of naive injection attacks.
One emerging technique that deserves attention is Chain-of-Table (2024-2025), designed specifically for working with tabular data.
Rather than flattening a table into prose, you prompt the model to perform “table operations” as intermediate steps—selecting rows, grouping by columns, sorting by criteria. This mirrors how a human would approach a data task. On benchmarks like WikiTQ and TabFact, Chain-of-Table improves performance by 6-9% compared to converting tables to plain text and using standard reasoning frameworks.
What ties all of this together is a simple insight: prompting is engineering, not poetry. It requires systematic thinking about structure, testing, iteration, and understanding your tools' idiosyncrasies.
You can't just think of a clever question and expect brilliance. You need to understand how models read your instructions, what reasoning frameworks work best for your problem type, and how to leverage automated optimization to go beyond what human intuition alone can achieve.
The models themselves aren't changing dramatically every month, but the ways we interact with them are becoming increasingly sophisticated. As you write prompts going forward, think less like you're having a casual conversation and more like you're configuring a system. Specify your components clearly. Choose a reasoning framework suited to your problem. Test your approach. Optimize it.
The art and science of prompting isn't about finding magical phrases. It's about understanding the machinery beneath the surface—and using that understanding to ask better questions.
Enter your email to subscribe to updates:
Do share your thoughts and comments.
from Raphael Mimoun
This post was originally written in February 2025 and last updated in January 2026 to reflect what I’ve learnt after a year of using my DIY music streaming service.
I wanted to move away from Spotify to have more control over my music and music-consumption experience. Without writing a line of code (I wouldn't know how anyway), I built a DIY music streaming services that anyone with a bit of tech savvy can build for their own personal use.
This post is a little long. If you want to go straight to the main takeaways, I suggest you scroll down and check out the following sections: Objectives and results; Cost; Get Started. But if you got time and are curious (hey fellow nerd 👋🏻), read away! Also, let me know if you have feedback on the streaming service I built—any suggestion to improve it is welcome!
I've been on Spotify since 2012. With Spotify, I've been able to easily discover new artists and tracks; keep my music at my fingertips whether I'm on my laptop, phone, or even someone else's device; and seamlessly share playlists with friends. Like most streaming services nowadays—Apple Music, YouTube Music, Deezer—Spotify is easy to use. It is seamless and extremely convenient.
But like most things in this world, when something is so convenient, there is a darker side to it. Spotify is a typical hyper-capitalist tech company: it barely compensates artists for their work; the money it makes goes to shareholders and executives, and gets reinvested in destructive companies and industries; and even for users who have become used to a convenient, seamless experience, the service is becoming increasingly enshittified with features we don't want and don't need.
Using a streaming service like Spotify also means we don't own our music: tracks we love at times become 'unavailable' due to copyright changes; we can't easily pack up our albums and playlists and move to a different service; and changes in the company's business priorities can just lock us out of a music library we've spent years building.
In general, I actually kinda like the idea of moving past ownership and instead sharing, borrowing, or renting the things we use (when it comes to physical goods, sharing, borrowing, or renting are the only ways we can meaningfully address the environmental crisis); but I'm not willing to rent my music when it is from a billion-dollar company that exploits those who make the music and reinvests in AI killing drones.
All of this to say: time to let go of our beloved streaming services.
It took me a few months of research and tweaking, but I managed to build my own music streaming service. This post describes the streaming service I built and explains the thinking behind the decisions I made. Hopefully, it can help others build their own streaming service without having to do all the research and testing I did. And to be clear, I don't code and I don't have the skills to manage servers. So the system I built should be accessible to anyone, even without technical skills. You just need to be comfortable with tech and be willing to put in a bit of time into it.
Alright, let's dive in.
My objectives in building this music streaming service were to:
The solution I came up with:
Below is a description of the streaming services I built. The list of components may sound like a lot, and it is a lot, but once you set up the system and get used to it, it becomes fairly seamless to use.
Also, because every part of the system is open, replacing one component with something you like better is easy (using a different music server; buying your music from a different store; selecting a different mobile app to stream your music; etc).
This is where the music files are hosted and what the apps you use (whether on mobile or on your computer) will connect to in order to stream or download music from.
The solution I recommend is Navidrome. There are other good solutions out there (Jellyfin, Plex, etc), but Navidrome came on top for me because it focuses exclusively on music (other solutions often support all media types, including films and TV shows) and because it can be installed and managed without technical skills. Navidrome is open-source and is very actively being developed, so we see improvements released on a regular basis.
Navidrome is free and open-source, and can be supported with monthly donations.
The beautiful thing about a music server like Navidrome is that you can install it anywhere: on your computer, on a server at home, or on a remote server.
To me, it is important to have uninterrupted access to my music: I want to be able to listen to it at any time and from any device. This is why I opted for installing Navidrome using a professional hosting service. This way, I don't have to worry about whether my home's internet connection is good enough or if the server is up and running. Using a professional service is also more energy-efficient than having a single server running from home.
This has some privacy drawback: people working for the host can access my music if they really want to; but it's just music so I don't mind. Plus, it's still a significant step up in terms of privacy, given that corporate streaming services like Spotify collect an enormous amount of data and sell it to advertisers, like most hyper-capitalist internet companies.
PikaPods is the best solution I found for this. In just a few clicks, without any coding, you can install Navidrome on a server. Every time Navidrome releases an update, PikaPods takes care of upgrading so you get access to the latest features. And if you want to get fancy, you can connect your own domain. My server is at https://music.raphmim.com.
PikaPods is cheap: I pay $4 per month for 50GB of storage (that’s about 20,000 songs in mp3). And PikaPods has a profit-sharing agreement with Navidrome, so part of what I pay goes to Navidrome developers!
Buying actual music files (good'ol mp3!) hits two birds with one stone: it gets me the ownership over my music library, and it gets artists much higher compensation that from streaming services.
Any new song I discover and like, I buy from Bandcamp, which is a platform entirely dedicated to supporting artists. If I can't find a file on Bandcamp, I'll buy it from Qobuz's music store. Each track is anywhere between $1 and $2. Looking at my usage of Spotify over the past few years, I usually add to my library between 5 and 15 new songs per month. So with an average of 10 new songs per month at an average of $1.5 per song, that's just $15 per month.
The difference in artist compensation is drastic. If I stream a song 50 times on Spotify, the artist will get paid about $0.15. Yep, that's just 15 cents of a US dollar for 50 streams. And that's for songs I listen to a lot. Most songs I will never listen to 50 times in my entire life. By contrast, if I buy the file on Bandcamp for $1, the artist or label gets about 0.80 cents. Pretty good deal.
It's technically possible to store all music files directly on the music server, but I find it much easier to instead store my library on my computer and once in a while, sync my library with the server. This way, I only have to do any work on the server once every few weeks.
I simply have a library folder on my computer where I store all my music files. It is important to keep those files organized, or at least properly tagged. These tags (like artist , title , genre , album , etc) make it possible for an app to organize your files and make them easily findable. A well-tagged library is pretty important to make it easy and seamless to navigate your music.
And to sync your library from your computer with the music server, tools like FileZilla do a great job. FileZilla lets you connect to the server and upload your files through a relatively accessible user interface—no command-line work needed. To make things even smoother, I suggest buying a FileZilla Pro license, to be avoid having to re-upload the entire library each time and instead only uploading files that were recently added to the library or files of which tags were recently updated.
So once it's set up, managing your library only means 1) adding newly purchased files to your library folder 2) ensuring the new files are properly tagged and 3) syncing the library from your computer to the music server.
Now that we have a well-organized music library on our computer and that we regularly sync it with our music server, the only missing piece are the clients we'll need to actually listen to the music: mobile and desktop apps.
Because Navidrome is an open app that uses open standards (specifically the open Subsonic API), there are dozens of apps that can stream music from a Navidrome server. All you have to do is go through the list of compatible apps, test a few, and pick the one you like best.
As my Android client, I use Symfonium. It's a $5 one-time payment, but there is a trial period to see if it's a good fit for your need. Symfonium is great because it lets you completely customize the app's interface.
And for desktop, I use Strawberry, a super old-looking music player that's nonetheless very powerful and, once you get used to the outdated interface, does the job perfectly. It's free for Linux but costs $5 per month for Mac and Windows.
One of the drawbacks of this personal music server outlined above is music discoverability. Streaming music services make it easy to listen to new artists, albums, or tracks, and this has been a key way for me to discover new music. I now discover music through two channels:
Internet radios: because all the apps and tools outlined above are built to be part of an open ecosystem, many of them include the option to stream internet radios directly from inside their apps. So you can add the radio stations you like to your Navidrome server, and if your client supports it, you'll have access to these radio stations directly from your mobile or desktop app. That's a great way of discovering new music.
Ethical streaming: internet radios are great but they don't solve the “I wanna check out this artist's work before buying it” problem. So I signed up for Qobuz's basic streaming plan ($10/month). Qobuz pays artists four times what Spotify pays per stream, so it's an acceptable option, and it lets me do everything expected from a normal music streaming plan. I only use it to discover new music, and not to build playlists or listen to my library, because I still prefer owning my music and controlling my streaming service.
So looking back at the whole streaming service, the cost for each component is:
Total cost: $20 to $37/month
If you're looking to build the system I described for your own usage, here is how to get started:
Create an account on PikaPods, add some money, and install Navidrome. If you have your own domain, you can even connect it (my music server is at https://music.raphmim.com!)
Create your library on your computer with music files you already own or by purchasing albums or songs on Bandcamp or Qobuz. If you want to transfer your library from a streaming service, it will take a bit of work but it's doable (you can use a service like Tune My Music to transfer your songs to YouTube Music, and then download from YouTube Music using yt-dlp; if you do that, consider buying some merch or albums from your favorite artists on Bandcamp).
Tag your library using an automatic tool like MusicBrainz Picard or manually using Strawberry.
Sync your library from your computer to the Navidrome server using FileZilla or a similar solution.
Install a mobile or desktop app that supports the Subsonic API and connect it to your Navidrome server.
Enjoy your music, free from capitalist exploitation and algorithmic manipulation!