from digital ash
I guess the lazy dog jumped over the quick brown fox for once…
from Stefan Angrick
This post is part of a four-part summary of Google's Machine Learning Crash Course. For context, check out this post.
The model is only a small part of real-world production ML systems. It often represents only 5% or less of the total codebase in the system.

Source: Production ML systems | Machine Learning | Google for Developers
Machine learning models can be trained statically (once) or dynamically (continuously).
| | Static training (offline training) | Dynamic training (online training) |
|---|---|---|
| Advantages | Simpler. You only need to develop and test the model once. | More adaptable. Keeps up with changes in data patterns, providing more accurate predictions. |
| Disadvantages | Sometimes stale. Can become outdated if data patterns change, requiring data monitoring. | More work. You must build, test, and release a new product continuously. |
Choosing between static and dynamic training depends on the specific dataset and how frequently it changes.
Monitoring input data is essential for both static and dynamic training to ensure reliable predictions.
Source: Production ML systems: Static versus dynamic training | Machine Learning | Google for Developers
Inference involves using a trained model to make predictions on unlabelled examples, and it can be done as follows:
Static inference (offline inference, batch inference) generates predictions in advance and caches them, which suits scenarios where prediction speed is critical.
Dynamic inference (online inference, real-time inference) generates predictions on demand, offering flexibility for diverse inputs.
| | Static inference (offline inference, batch inference) | Dynamic inference (online inference, real-time inference) |
|---|---|---|
| Advantages | No need to worry about cost of inference; allows post-verification of predictions before pushing | Can infer a prediction on any new item as it comes in |
| Disadvantages | Limited ability to handle uncommon inputs | Compute-intensive and latency-sensitive; monitoring needs are intensive |
Choosing between static and dynamic inference depends on factors such as model complexity, desired prediction speed, and the nature of the input data.
Static inference is advantageous when cost and prediction verification are prioritised, while dynamic inference excels in handling diverse, real-time predictions.
Source: Production ML systems: Static versus dynamic inference | Machine Learning | Google for Developers
Feature engineering can be performed before or during model training, each with its own advantages and disadvantages.
Source: Production ML systems: When to transform data? | Machine Learning | Google for Developers
Deploying a machine learning model involves validating data, features, model versions, serving infrastructure, and pipeline integration.
Reproducible model training involves deterministic seeding, fixed initialisation order, averaging multiple runs, and using version control.
Integration tests ensure that different components of the ML pipeline work together seamlessly and should run continuously and for new model or software versions.
Before serving a new model, validate its quality by checking for sudden and gradual degradations against previous versions and fixed thresholds.
Ensure model-infrastructure compatibility by staging the model in a sandboxed server environment to avoid dependency conflicts.
Source: Production ML systems: Deployment testing | Machine Learning | Google for Developers
ML pipeline monitoring involves validating data (using data schemas) and features (using unit tests), tracking real-world metrics, and addressing potential biases in data slices.
Monitoring training-serving skew, label leakage, model age, and numerical stability is crucial for maintaining pipeline health and model performance.
Live model quality testing uses methods such as human labelling and statistical analysis to ensure ongoing model effectiveness in real-world scenarios.
Implementing proper randomisation through deterministic data generation enables reproducible experiments and consistent analysis.
Maintaining invariant hashing ensures that data splits remain consistent across experiments, contributing to reliable analysis and model evaluation.
Source: Production ML systems: Monitoring pipelines | Machine Learning | Google for Developers
Continuously monitor models in production to evaluate feature importance and potentially remove unnecessary features, ensuring prediction quality and resource efficiency.
Data reliability is crucial. Consider data source stability, potential changes in upstream data processes, and the creation of local data copies to control versioning and mitigate risks.
Be aware of feedback loops, where a model's predictions influence future input data, potentially leading to unexpected behaviour or biased outcomes, especially in interconnected systems.
Source: Production ML systems: Questions to ask | Machine Learning | Google for Developers
AutoML automates tasks in the machine learning workflow, such as data engineering (feature selection and engineering), training (algorithm selection and hyperparameter tuning), and analysis, making model building faster and easier.

While manual training involves writing code and iteratively adjusting it, AutoML reduces repetitive work and the need for specialised skills.
Source: Automated Machine Learning (AutoML) | Google for Developers
Benefits:
Limitations:
Large amounts of data are generally required for AutoML, although specialised systems using transfer learning (taking a model trained on one task and adapting its learned representations to a different but related task) can reduce this requirement.
AutoML suits teams with limited ML experience or those seeking productivity gains without customisation needs. Custom (manual) training suits cases where model quality and customisation matter most.
Source: AutoML: Benefits and limitations | Machine Learning | Google for Developers
AutoML tools fall into two categories:
The AutoML workflow follows steps similar to traditional machine learning, including problem definition, data gathering, preparation, model development, evaluation, and potential retraining.
Data preparation is crucial for AutoML and involves labelling, cleaning and formatting data, and applying feature transformations.
No-code AutoML tools guide users through model development with steps such as data import, analysis, refinement, and configuration of run parameters before starting the automated training process.
Source: AutoML: Getting started | Machine Learning | Google for Developers
Before putting a model into production, it is critical to audit training data and evaluate predictions for bias.
Source: Fairness | Machine Learning | Google for Developers
Machine learning models can be susceptible to bias due to human involvement in data selection and curation.
Understanding common human biases is crucial for mitigating their impact on model predictions.
Types of bias include reporting bias, historical bias, automation bias, selection bias, coverage bias, non-response bias, sampling bias, group attribution bias (in-group bias and out-group homogeneity bias), implicit bias, confirmation bias, and experimenter's bias, among others.
Source: Fairness: Types of bias | Machine Learning | Google for Developers
Missing or unexpected feature values in a dataset can indicate potential sources of bias.
Data skew, where certain groups are under- or over-represented, can introduce bias and should be addressed.
Evaluating model performance by subgroup ensures fairness and equal performance across different characteristics.
Source: Fairness: Identifying bias | Machine Learning | Google for Developers
Machine learning engineers use two primary strategies to mitigate bias in models:
Augmenting training data involves collecting additional data to address missing, incorrect, or skewed data, but it can be infeasible due to data availability or resource constraints.
Adjusting the model's loss function involves using fairness-aware optimisation functions rather than the common default log loss.
The TensorFlow Model Remediation Library provides optimisation functions designed to penalise errors in a fairness-aware manner:
Source: Fairness: Mitigating bias | Machine Learning | Google for Developers
Aggregate model performance metrics such as precision, recall, and accuracy can hide biases against minority groups.
Fairness in model evaluation involves ensuring equitable outcomes across different demographic groups.
Fairness metrics can help assess model predictions for bias.
Candidate pool of 100 students: 80 students belong to the majority group (blue), and 20 students belong to the minority group (orange):

Source: Fairness: Evaluating for bias | Machine Learning | Google for Developers
Demographic parity aims to ensure equal acceptance rates for majority and minority groups, regardless of individual qualifications.
Both the majority (blue) and minority (orange) groups have an acceptance rate of 20%:

While demographic parity promotes equal representation, it can overlook differences in individual qualifications within each group, potentially leading to unfair outcomes.
Qualified students in both groups are shaded in green, and qualified students who were rejected are marked with an X:

Majority acceptance rate = Qualified majority accepted / Qualified majority = 16/35 = 46%
Minority acceptance rate = Qualified minority accepted / Qualified minority = 4/15 = 27%
When the distribution of a preferred label (“qualified”) differs substantially between groups, demographic parity may not be the most appropriate fairness metric.
There may be additional benefits/drawbacks of demographic parity not discussed here that are also worth considering.
Source: Fairness: Demographic parity | Machine Learning | Google for Developers
Equality of opportunity focuses on ensuring that qualified individuals have an equal chance of acceptance, regardless of demographic group.
Qualified students in both groups are shaded in green:

Majority acceptance rate = Qualified majority accepted / Qualified majority = 14/35 = 40%
Minority acceptance rate = Qualified minority accepted / Qualified minority = 6/15 = 40%
Equality of opportunity has limitations, including reliance on a clearly defined preferred label and challenges in settings that lack demographic data.
It is possible for a model to satisfy both demographic parity and equality of opportunity under specific conditions where positive prediction rates and true positive rates align across groups.
Source: Fairness: Equality of opportunity | Machine Learning | Google for Developers
Counterfactual fairness evaluates fairness by comparing predictions for similar individuals who differ only in a sensitive attribute such as demographic group.
This metric is particularly useful when datasets lack complete demographic information for most examples but contain it for a subset.
Candidate pool, with demographic group membership unknown for most candidates (icons shaded in grey):

Counterfactual fairness may not capture broader systemic biases across subgroups. Other fairness metrics, such as demographic parity and equality of opportunity, provide a more holistic view but may require complete demographic data.
Summary
Selecting the appropriate fairness metric depends on the specific application and desired outcome, with no single “right” metric universally applicable.
For example, if the goal is to achieve equal representation, demographic parity may be the optimal metric. If the goal is to achieve equal opportunity, equality of opportunity may be the best metric.
Some definitions of fairness are mutually incompatible.
Source: Fairness: Counterfactual fairness | Machine Learning | Google for Developers
from Stefan Angrick
This post is part of a four-part summary of Google's Machine Learning Crash Course. For context, check out this post.
Neural networks are a model architecture designed to automatically identify non-linear patterns in data, eliminating the need for manual feature cross experimentation.
Source: Neural networks | Machine Learning | Google for Developers
In neural network terminology, additional layers between the input layer and the output layer are called hidden layers, and the nodes in these layers are called neurons.

Source: Neural networks: Nodes and hidden layers | Machine Learning | Google for Developers
Each neuron in a neural network performs the following two-step action: it first computes a weighted sum of its input values plus a bias, and then passes that sum through a non-linear activation function.
Common activation functions include sigmoid, tanh, and ReLU.
The sigmoid function maps input x to an output value between 0 and 1:
$$
F(x) = \frac{1}{1 + e^{-x}}
$$

The tanh function (short for “hyperbolic tangent”) maps input x to an output value between -1 and 1:
$$
F(x) = \tanh{(x)}
$$

The rectified linear unit activation function (or ReLU, for short) applies a simple rule: if the input is positive, return the input; otherwise, return 0.
$$
F(x) = \max(0, x)
$$
ReLU often outperforms sigmoid and tanh because it reduces vanishing gradient issues and requires less computation.

A neural network consists of an input layer, one or more hidden layers of neurons, and an output layer, with non-linear activation functions applied in the hidden layers.
Source: Neural networks: Activation functions | Machine Learning | Google for Developers
Backpropagation is the primary training algorithm for neural networks. It calculates how much each weight and bias in the network contributed to the overall prediction error by applying the chain rule of calculus. It works backwards from the output layer to tell the gradient descent algorithm which equations to adjust to reduce loss.
In practice, this involves a forward pass, where the network makes a prediction and the loss function measures the error, followed by a backward pass that propagates that error back through the layers to compute gradients for each parameter.
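To make the chain-rule idea concrete, here is a hedged, minimal sketch (not the course's code) of a forward and backward pass for a single sigmoid neuron with squared-error loss; the data, initial parameters, and learning rate are arbitrary:

```python
import numpy as np

# One sigmoid neuron trained with backpropagation on a single example.
x, y_true = np.array([0.5, -1.0]), 1.0   # arbitrary input features and label
w, b = np.array([0.1, 0.2]), 0.0         # arbitrary initial weights and bias
learning_rate = 0.1

for step in range(100):
    # Forward pass: prediction and squared-error loss.
    z = w @ x + b
    y_pred = 1.0 / (1.0 + np.exp(-z))
    loss = (y_pred - y_true) ** 2

    # Backward pass: chain rule from the loss back to each parameter.
    dloss_dy = 2.0 * (y_pred - y_true)
    dy_dz = y_pred * (1.0 - y_pred)      # derivative of the sigmoid
    grad_w = dloss_dy * dy_dz * x
    grad_b = dloss_dy * dy_dz * 1.0

    # Gradient descent update.
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(float(loss))  # loss shrinks towards 0 over the iterations
```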
Best practices for neural network training:
Source: Neural Networks: Training using backpropagation | Machine Learning | Google for Developers
Multi-class classification models predict from multiple possibilities (binary classification models predict just two).
Multi-class classification can be achieved through two main approaches:
One-vs.-all uses multiple binary classifiers, one for each possible outcome, to determine the probability of each class independently.

This approach is fairly reasonable when the total number of classes is small.
We can create a more efficient one-vs.-all model with a deep neural network in which each output node represents a different class.

Note that the probabilities do not sum to 1. With a one-vs.-all approach, the probability of each binary set of outcomes is determined independently of all the other sets (the sigmoid function is applied to each output node independently).
One-vs.-one (softmax) predicts the probability of each class relative to all other classes, using the softmax function in the output layer to assign each class a decimal probability such that all probabilities sum to 1.0. This additional constraint helps training converge more quickly.
Note that the softmax layer must have the same number of nodes as the output layer.

The softmax formula extends logistic regression to multiple classes: $$ p(y = j|\textbf{x}) = \frac{e^{(\textbf{w}_j^{T}\textbf{x} + b_j)}}{\sum_{k\in K} e^{(\textbf{w}_k^{T}\textbf{x} + b_k)}} $$
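As a quick numerical sanity check (my own sketch, not the course's), the softmax formula can be implemented in a few lines of NumPy; the logits below are arbitrary example values:

```python
import numpy as np

def softmax(logits):
    """Converts raw scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

logits = np.array([2.0, 1.0, 0.1])      # arbitrary raw outputs for three classes
probs = softmax(logits)
print(probs, probs.sum())               # class probabilities; the sum is 1.0
```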
Full softmax is fairly cheap when the number of classes is small but can become computationally expensive with many classes.
Candidate sampling offers an alternative for increased efficiency. It computes probabilities for all positive labels but only a random sample of negative labels. For example, if we are interested in determining whether an input image is a beagle or a bloodhound, we do not have to provide probabilities for every non-dog example.
One label versus many labels
Softmax assumes that each example is a member of exactly one class. Some examples, however, can simultaneously be a member of multiple classes. For multi-label problems, use multiple independent logistic regressions instead.
Example: To classify dog breeds from images, including mixed-breed dogs, use one-vs.-all, since it predicts each breed independently and can assign high probabilities to multiple breeds, unlike softmax, which forces probabilities to sum to 1.
Source: Neural networks: Multi-class classification | Machine Learning | Google for Developers
Embeddings are lower-dimensional representations of sparse data that address problems associated with one-hot encodings.
A one-hot encoded feature “meal” of 5,000 popular meal items:

This representation of data has several problems:
Embeddings, lower-dimensional representations of sparse data, address these issues.
Source: Embeddings | Machine Learning | Google for Developers
Embeddings are low-dimensional representations of high-dimensional data, often used to capture semantic relationships between items.
Embeddings place similar items closer together in the embedding space, allowing for efficient machine learning on large datasets.
Example of a 1D embedding of a sparse feature vector representing meal items:

2D embedding:

3D embedding:

Distances in the embedding space represent relative similarity between items.
Real-world embeddings can encode complex relationships, such as those between countries and their capitals, allowing models to detect patterns.
In practice, embedding spaces have many more than three dimensions, although far fewer than the original data, and the meaning of individual dimensions is often unclear.
Embeddings usually are task-specific, but one task with broad applicability is predicting the context of a word.
Static embeddings like word2vec represent all meanings of a word with a single point, which can be a limitation in some cases. When each word or data point has a single embedding vector, this is called a static embedding.
word2vec can refer both to an algorithm for obtaining static word embeddings and to a set of word vectors that were pre-trained with that algorithm.
Source: Embeddings: Embedding space and static embeddings | Machine Learning | Google for Developers
Embeddings can be created using dimensionality reduction techniques such as PCA or by training them as part of a neural network.
Training an embedding within a neural network allows customisation for specific tasks, where the embedding layer learns optimal weights to represent data in a lower-dimensional space, but it may take longer than training the embedding separately.
In general, you can create a hidden layer of size d in your neural network that is designated as the embedding layer, where d represents both the number of nodes in the hidden layer and the number of dimensions in the embedding space.

Word embeddings, such as word2vec, leverage the distributional hypothesis to map semantically similar words to geometrically close vectors. However, such static word embeddings have limitations because they assign a single representation per word.
Contextual embeddings offer multiple representations based on context. For example, “orange” would have a different embedding for every unique sentence containing the word in the dataset (as it could be used as a colour or a fruit).
Contextual embeddings encode positional information, while static embeddings do not. Because a contextual embedding depends on the surrounding tokens, a single token can have multiple contextual embedding vectors, whereas static embeddings allow only a single representation of each token.
Methods for creating contextual embeddings include ELMo, BERT, and transformer models with a self-attention layer.
Source: Embeddings: Obtaining embeddings | Machine Learning | Google for Developers
A language model estimates the probability of a token or sequence of tokens given surrounding text, enabling tasks such as text generation, translation, and summarisation.
Tokens, the atomic units of language modelling, represent words, subwords, or characters and are crucial for understanding and processing language.
Example: “unwatched” would be split into three tokens: un (the prefix), watch (the root), ed (the suffix).
N-grams are ordered sequences of words used to build language models, where N is the number of words in the sequence.
Short N-grams capture too little information, while very long N-grams fail to generalise due to insufficient repeated examples in training data (sparsity issues).
Recurrent neural networks improve on N-grams by processing sequences token by token and learning which past information to retain or discard, allowing them to model longer dependencies across sentences and gain more context.
Model performance depends on training data size and diversity.
While recurrent neural networks improve context understanding compared to N-grams, they have limitations, paving the way for the emergence of large language models that evaluate the whole context simultaneously.
Source: Large language models | Machine Learning | Google for Developers
Large language models (LLMs) predict sequences of tokens and outperform previous models because they use far more parameters and exploit much wider context.
Transformers form the dominant architecture for LLMs and typically combine an encoder that converts input text into an intermediate representation with a decoder that generates output text, for example translating between languages.

Partial transformers
Encoder-only models focus on representation learning and embeddings (which may serve as input for another system), while decoder-only models specialise in generating long sequences such as dialogue or text continuations.
Self-attention allows the model to weigh the importance of different words in relation to each other, enhancing context understanding.
Example: “The animal didn't cross the street because it was too tired.”
The self-attention mechanism determines the relevance of each nearby word to the pronoun “it”: the bluer the line, the more important that word is to the pronoun. As shown, “animal” is more important than “street” to the pronoun “it”.

Multi-head multi-layer self-attention
Each self-attention layer contains multiple self-attention heads. The output of a layer is a mathematical operation (such as a weighted average or dot product) of the outputs of the different heads.
A complete transformer model stacks multiple self-attention layers. The output from one layer becomes the input for the next, allowing the model to build increasingly complex representations, from basic syntax to more nuanced concepts.
Self-attention is an O(N^2 * S * D) problem.
LLMs are trained using masked predictions on massive datasets, enabling them to learn patterns and generate text based on probabilities. You probably will never train an LLM from scratch.
Instruction tuning can improve an LLM's ability to follow instructions.
Why transformers are so large
This course generally recommends building models with a smaller number of parameters, but research shows that transformers with more parameters consistently achieve better performance.
Text generation
LLMs generate text by repeatedly predicting the most probable next token, effectively acting as highly powerful autocomplete systems. You can think of a user's question to an LLM as the “given” sentence followed by a masked response.
Benefits and problems
While LLMs offer benefits such as clear text generation, they also present challenges.
Source: LLMs: What's a large language model? | Machine Learning | Google for Developers
General-purpose LLMs, also known as foundation LLMs, base LLMs, or pre-trained LLMs, are pre-trained on vast amounts of text, enabling them to understand language structure and generate creative content, but they act as platforms rather than complete solutions for tasks such as classification or regression.
Fine-tuning updates the parameters of a model to improve its performance on a specialised task, improving prediction quality.
Distillation aims to reduce model size, typically at the cost of some prediction quality.
Prompt engineering allows users to customise an LLM's output by providing examples or instructions within the prompt, leveraging the model's existing pattern-recognition abilities without changing its parameters.
One-shot, few-shot, and zero-shot prompting differ by how many examples the prompt provides, with more examples usually improving reliability by giving clearer context.
Prompt engineering does not alter the model's parameters. Prompts leverage the pattern-recognition abilities of the existing LLM.
Offline inference pre-computes and caches LLM predictions for tasks where real-time response is not critical, saving resources and enabling the use of larger models.
Responsible use of LLMs requires awareness that models inherit biases from their training and distillation data.
Source: LLMs: Fine-tuning, distillation, and prompt engineering | Machine Learning | Google for Developers
from Stefan Angrick
This post is part of a four-part summary of Google's Machine Learning Crash Course. For context, check out this post.
Numerical data: Integers or floating-point values that behave like numbers. They are additive, countable, ordered, and so on. Examples include temperature, weight, or the number of deer wintering in a nature preserve.
Source: Working with numerical data | Machine Learning | Google for Developers
A machine learning model ingests data through floating-point arrays called feature vectors, which are derived from dataset features. Feature vectors often utilise processed or transformed values instead of raw dataset values to enhance model learning.
Example of a feature vector: [0.13, 0.47]
Feature engineering is the process of converting raw data into suitable representations for the model. Common techniques are:
Non-numerical data like strings must be converted into numerical values for use in feature vectors.
Before creating feature vectors, it is crucial to analyse numerical data to detect anomalies and patterns in the data, which aids in identifying potential issues early in the data analysis process.
Outliers, values significantly distant from others, should be identified and handled appropriately.
A dataset probably contains outliers when:
Source: Numerical data: First steps | Machine Learning | Google for Developers
Data normalization is crucial for enhancing machine learning model performance by scaling features to a similar range. It is also recommended to normalise a single numeric feature that covers a wide range (for example, city population).
Normalisation has the following benefits:
| Normalization technique | Formula | When to use |
|---|---|---|
| Linear scaling | $$x'=\frac{x-x_\text{min}}{x_\text{max}-x_\text{min}}$$ | When the feature is roughly uniformly distributed across its range (flat-shaped distribution) |
| Z-score scaling | $$x' = \frac{x-\mu}{\sigma}$$ | When the feature is roughly normally distributed, with a peak close to the mean (bell-shaped distribution) |
| Log scaling | $$x'=\ln(x)$$ | When the feature distribution is heavily skewed, with a long tail on at least one side (heavy-tailed distribution) |
| Clipping | If $$x > \text{max}$$, set $$x'=\text{max}$$; if $$x < \text{min}$$, set $$x' = \text{min}$$ | When the feature contains extreme outliers |
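As a rough illustration of the four techniques (my own sketch, with arbitrary example values):

```python
import numpy as np

x = np.array([1.0, 2.0, 5.0, 10.0, 200.0])   # arbitrary example feature values

linear_scaled = (x - x.min()) / (x.max() - x.min())   # maps values to [0, 1]
z_scored = (x - x.mean()) / x.std()                   # mean 0, standard deviation 1
log_scaled = np.log(x)                                # compresses heavy tails (requires x > 0)
clipped = np.clip(x, a_min=None, a_max=20.0)          # cap extreme outliers at 20
```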
Source: Numerical data: Normalization | Machine Learning | Google for Developers
Binning (bucketing) is a feature engineering technique used to group numerical data into categories (bins). In many cases, this turns numerical data into categorical data.
For example, if a feature X has values ranging from 15 to 425, we can apply binning to represent X as a feature vector divided into specific intervals:
| Bin number | Range | Feature vector |
|---|---|---|
| 1 | 15-34 | [1.0, 0.0, 0.0, 0.0, 0.0] |
| 2 | 35-117 | [0.0, 1.0, 0.0, 0.0, 0.0] |
| 3 | 118-279 | [0.0, 0.0, 1.0, 0.0, 0.0] |
| 4 | 280-392 | [0.0, 0.0, 0.0, 1.0, 0.0] |
| 5 | 393-425 | [0.0, 0.0, 0.0, 0.0, 1.0] |
Even though X is a single column in the dataset, binning causes a model to treat X as five separate features. Therefore, the model learns separate weights for each bin.
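A minimal sketch of this binning scheme using NumPy; the example values of X are arbitrary:

```python
import numpy as np

# Map raw values of X into the five bins from the table above and one-hot encode the bin index.
bin_edges = [35, 118, 280, 393]               # boundaries between bins 1-5
x = np.array([20, 100, 150, 300, 400])        # arbitrary example values in [15, 425]

bin_index = np.digitize(x, bin_edges)         # 0-based bin number for each value
one_hot = np.eye(5)[bin_index]                # each row is the feature vector for one example
print(one_hot[0])                             # [1. 0. 0. 0. 0.] for x = 20 (bin 1)
```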
Binning offers an alternative to scaling or clipping and is particularly useful for handling outliers and improving model performance on non-linear data.
When to use: Binning works well when features exhibit a “clumpy” distribution, that is, the overall linear relationship between the feature and label is weak or nonexistent, or when feature values are clustered.
Example: Number of shoppers versus temperature. By binning them, the model learns separate weights for each bin.

While creating multiple bins is possible, it is generally recommended to avoid an excessive number, as this can lead to insufficient training examples per bin and increased feature dimensionality.
Quantile bucketing is a specific binning technique that ensures each bin contains a roughly equal number of examples, which can be particularly useful for datasets with skewed distributions.

Source: Numerical data: Binning | Machine Learning | Google for Developers
| Problem category | Example |
|---|---|
| Omitted values | A census taker fails to record a resident's age |
| Duplicate examples | A server uploads the same logs twice |
| Out-of-range feature values | A human accidentally types an extra digit |
| Bad labels | A human evaluator mislabels a picture of an oak tree as a maple |
You can use programs or scripts to identify and handle data issues such as omitted values, duplicates, and out-of-range feature values by removing or correcting them.
Source: Numerical data: Scrubbing | Machine Learning | Google for Developers
Source: Numerical data: Qualities of good numerical features | Machine Learning | Google for Developers
Synthetic features, such as polynomial transforms, enable linear models to represent non-linear relationships by introducing new features based on existing ones.
By incorporating synthetic features, linear regression models can effectively separate data points that are not linearly separable, using curves instead of straight lines. For example, adding the synthetic feature $$x^2$$ allows a linear model to separate two classes along the curve $$y = x^2$$.

Feature crosses, a related concept for categorical data, synthesise new features by combining existing features, further enhancing model flexibility.
Source: Numerical data: Polynomial transforms | Machine Learning | Google for Developers
Categorical data has a specific set of possible values. Examples include species of animals, names of streets, whether or not an email is spam, and binned numbers.
Categorical data can include numbers that behave like categories. An example is postal codes.
Encoding means converting categorical or other data to numerical vectors that a model can train on.
Preprocessing includes converting non-numerical data, such as strings, to floating-point values.
Source: Working with categorical data | Machine Learning | Google for Developers
Machine learning models require numerical input; therefore, categorical data such as strings must be converted to numerical representations.
The term dimension is a synonym for the number of elements in a feature vector. Some categorical features are low dimensional. For example:
| Feature name | # of categories | Sample categories |
|---|---|---|
| snowed_today | 2 | True, False |
| skill_level | 3 | Beginner, Practitioner, Expert |
| season | 4 | Winter, Spring, Summer, Autumn |
| dayofweek | 7 | Monday, Tuesday, Wednesday |
| planet | 8 | Mercury, Venus, Earth |
| car_colour | 8 | Red, Orange, Blue, Yellow |
When a categorical feature has a low number of possible categories, you can encode it as a vocabulary. This treats each category as a separate feature, allowing the model to learn distinct weights for each during training.
One-hot encoding transforms categorical values into numerical vectors (arrays) of N elements, where N is the number of categories. Exactly one of the elements in a one-hot vector has the value 1.0; all the remaining elements have the value 0.0.
| Feature | Red | Orange | Blue | Yellow | Green | Black | Purple | Brown |
|---|---|---|---|---|---|---|---|---|
| “Red” | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| “Orange” | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 0 |
| “Blue” | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| “Yellow” | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| “Green” | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 |
| “Black” | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| “Purple” | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 |
| “Brown” | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
It is the one-hot vector, not the string or the index number, that gets passed to the feature vector. The model learns a separate weight for each element of the feature vector.
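A minimal sketch of vocabulary-based one-hot encoding in plain NumPy (my own illustration, reusing the car_colour vocabulary above):

```python
import numpy as np

# Build a vocabulary for car_colour and one-hot encode a value.
vocabulary = ["Red", "Orange", "Blue", "Yellow", "Green", "Black", "Purple", "Brown"]
index = {colour: i for i, colour in enumerate(vocabulary)}

def one_hot(colour):
    """Returns an 8-element vector with a single 1.0 at the colour's index."""
    vec = np.zeros(len(vocabulary))
    vec[index[colour]] = 1.0
    return vec

print(one_hot("Blue"))   # [0. 0. 1. 0. 0. 0. 0. 0.]
```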
The end-to-end process to map categories to feature vectors:

In a true one-hot encoding, only one element has the value 1.0. In a variant known as multi-hot encoding, multiple values can be 1.0.
A feature whose values are predominantly zero (or empty) is termed a sparse feature.
Sparse representation efficiently stores one-hot encoded data by only recording the position of the '1' value to reduce memory usage.
Notice that the sparse representation consumes far less memory. Importantly, the model must train on the one-hot vector, not the sparse representation.
The sparse representation of a multi-hot encoding stores the positions of all the non-zero elements. For example, the sparse representation of a car that is both “Blue” and “Black” is 2, 5.
Categorical features can have outliers. If “car_colour” includes rare values such as “Mauve” or “Avocado”, you can group them into one out-of-vocabulary (OOV) category. All rare colours go into this single bucket, and the model learns one weight for it.
For high-dimensional categorical features with many categories, one-hot encoding might be inefficient, and embeddings or hashing (also called the hashing trick) are recommended.
Source: Categorical data: Vocabulary and one-hot encoding | Machine Learning | Google for Developers
Categorical data quality hinges on how categories are defined and labelled, impacting data reliability.
Human-labelled data, known as “gold labels”, is generally preferred for training due to its higher quality, but it is essential to check for human errors and biases.
Machine-labelled data, or “silver labels”, can introduce biases or inaccuracies, necessitating careful quality checks and awareness of potential common-sense violations.
High dimensionality in categorical data increases training complexity and costs, leading to techniques such as embeddings for dimensionality reduction.
Source: Categorical data: Common issues | Machine Learning | Google for Developers
Feature crosses are created by combining two or more categorical or bucketed features to capture interactions and non-linearities within a dataset.
For example, consider a leaf dataset with two categorical features: leaf edge (Smooth, Toothed, Lobed) and leaf arrangement (Opposite, Alternate).
The feature cross, or Cartesian product, of these two features would be:
{Smooth_Opposite, Smooth_Alternate, Toothed_Opposite, Toothed_Alternate, Lobed_Opposite, Lobed_Alternate}
For example, if a leaf has a lobed edge and an alternate arrangement, the feature-cross vector will have a value of 1 for “Lobed_Alternate”, and a value of 0 for all other terms:
{0, 0, 0, 0, 0, 1}
This dataset could be used to classify leaves by tree species, since these characteristics do not vary within a species.
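As a hedged sketch (not the course's code), the crossed vocabulary and the one-hot vector above could be produced like this:

```python
from itertools import product

# Feature cross of the two leaf features described above.
edges = ["Smooth", "Toothed", "Lobed"]
arrangements = ["Opposite", "Alternate"]

# The Cartesian product of the two vocabularies gives the crossed feature's vocabulary.
crossed_vocabulary = [f"{e}_{a}" for e, a in product(edges, arrangements)]

def cross_one_hot(edge, arrangement):
    """One-hot encodes a single leaf over the crossed vocabulary."""
    return [1 if term == f"{edge}_{arrangement}" else 0 for term in crossed_vocabulary]

print(crossed_vocabulary)
print(cross_one_hot("Lobed", "Alternate"))   # exactly one element is 1
```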
Feature crosses are somewhat analogous to polynomial transforms.
Feature crosses can be particularly effective when guided by domain expertise. It is often possible, though computationally expensive, to use neural networks to automatically find and apply useful feature combinations during training.
Overuse of feature crosses with sparse features should be avoided, as it can lead to excessive sparsity in the resulting feature set. For example, if feature A is a 100-element sparse feature and feature B is a 200-element sparse feature, a feature cross of A and B yields a 20,000-element sparse feature.
Source: Categorical data: Feature crosses | Machine Learning | Google for Developers
Source: Datasets, generalization, and overfitting | Machine Learning | Google for Developers
A machine learning model's performance is heavily reliant on the quality and quantity of the dataset it is trained on, with larger, high-quality datasets generally leading to better results.
Datasets can contain various data types, including numerical, categorical, text, multimedia, and embedding vectors, each requiring specific handling for optimal model training.
The following are common causes of unreliable data in datasets:
Maintaining data quality involves addressing issues such as label errors, noisy features, and proper filtering to ensure the reliability of the dataset for accurate predictions.
Incomplete examples with missing feature values should be handled by either deletion or imputation to avoid negatively impacting model training.
When imputing missing values, use reliable methods such as mean/median imputation and consider adding an indicator column to signal imputed values to the model. For example, alongside temperature include “temperature_is_imputed”. This lets the model learn to trust real observations more than imputed ones.
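A minimal pandas sketch of mean imputation with an indicator column; the temperature values are arbitrary and the column names are purely illustrative:

```python
import pandas as pd

# Hypothetical "temperature" feature with missing values.
df = pd.DataFrame({"temperature": [21.0, None, 19.5, None, 23.0]})

# Flag which rows were imputed, then fill the gaps with the column mean.
df["temperature_is_imputed"] = df["temperature"].isna()
df["temperature"] = df["temperature"].fillna(df["temperature"].mean())
print(df)
```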
Source: Datasets: Data characteristics | Machine Learning | Google for Developers
Direct labels are generally preferred but often unavailable.
Use a proxy label when no direct label exists or when the direct concept resists easy numeric representation. Carefully evaluate proxy labels to ensure they are a suitable approximation.
Human-generated labels, while offering flexibility and nuanced understanding, can be expensive to produce and prone to errors, requiring careful quality control.
Models can train on a mix of automated and human-generated labels, but an extra set of human labels often adds complexity without sufficient benefit.
Source: Datasets: Labels | Machine Learning | Google for Developers
Imbalanced datasets occur when one label (majority class) is significantly more frequent than another (minority class), potentially hindering model training on the minority class.
Note: Accuracy is usually a poor metric for assessing a model trained on a class-imbalanced dataset.
A highly imbalanced floral dataset containing far more sunflowers (200) than roses (2):

During training, a model should learn two things: what each class looks like, and how common each class is.
Standard training conflates these two goals. In contrast, a two-step technique of downsampling and upweighting the majority class separates these two goals, enabling the model to achieve both.
Step 1: Downsample the majority class by training on only a small fraction of majority class examples, which makes an imbalanced dataset more balanced during training and increases the chance that each batch contains enough minority examples.
For example, with a class-imbalanced dataset consisting of 99% majority class and 1% minority class examples, we could downsample the majority class by a factor of 25 to create a more balanced training set (80% majority class and 20% minority class).
Downsampling the majority class by a factor of 25:

Step 2: Upweight the downsampled majority class by the same factor used for downsampling, so each majority class error counts proportionally more during training. This corrects the artificial class distribution and bias introduced by downsampling, because the training data no longer reflects real-world frequencies.
Continuing the example from above, we must upweight the majority class by a factor of 25. That is, when the model makes an error on a majority class example, treat the loss as if it were 25 errors (multiply the regular loss by 25).
Upweighting the majority class by a factor of 25:

Experiment with different downsampling and upweighting factors just as you would experiment with other hyperparameters.
Benefits of this technique include a better model (the resultant model knows what each class looks like and how common each class is) and faster convergence.
Source: Datasets: Class-imbalanced datasets | Machine Learning | Google for Developers
Machine learning models should be tested against unseen data.
It is recommended to split the dataset into three subsets: training, validation, and test sets.

The validation set is used for initial testing during training (to determine hyperparameter tweaks, add, remove, or transform features, and so on), and the test set is used for final evaluation.

The validation and test sets can “wear out” with repeated use. For this reason, it is a good idea to collect more data to “refresh” the test and validation sets.
A good test set is:
In theory, the validation set and test set should contain the same number of examples, or nearly so.
Source: Datasets: Dividing the original dataset | Machine Learning | Google for Developers
Machine learning models require all data, including features such as street names, to be transformed into numerical (floating-point) representations for training.
Normalisation improves model training by converting existing floating-point features to a constrained range.
When dealing with large datasets, select a subset of examples for training. When possible, select the subset that is most relevant to your model's predictions. Safeguard privacy by omitting examples containing personally identifiable information.
Source: Datasets: Transforming data | Machine Learning | Google for Developers
Generalisation refers to a model's ability to perform well on new, unseen data.
Source: Generalization | Machine Learning | Google for Developers
Overfitting means creating a model that matches the training set so closely that the model fails to make correct predictions on new data.
Generalization is the opposite of overfitting. That is, a model that generalises well makes good predictions on new data.
An overfit model is analogous to an invention that performs well in the lab but is worthless in the real world. An underfit model is like a product that does not even do well in the lab.
Overfitting can be detected by observing diverging loss curves for training and validation sets on a generalization curve (a graph that shows two or more loss curves). A generalization curve for a well-fit model shows two loss curves that have similar shapes.
Common causes of overfitting include:
Dataset conditions for good generalization include:
Source: Overfitting | Machine Learning | Google for Developers
Simpler models often generalise better to new data than complex models, even if they perform slightly worse on training data.
Occam's Razor favours simpler explanations and models.
Model training should minimise both loss and complexity for optimal performance on new data. $$ \text{minimise}(\text{loss + complexity}) $$
Unfortunately, loss and complexity are typically inversely related. As complexity increases, loss decreases. As complexity decreases, loss increases.
Regularisation techniques help prevent overfitting by penalising model complexity during training.
Source: Overfitting: Model complexity | Machine Learning | Google for Developers
L2 regularisation is a popular regularisation technique that reduces model complexity and helps prevent overfitting. It uses the following formula: $$ L_2 \text{ regularisation} = w^2_1 + w^2_2 + \ldots + w^2_n $$
It penalises especially large weights.
L2 regularisation encourages weights towards 0, but never pushes them all the way to zero.
A regularisation rate (lambda) controls the strength of regularisation. $$ \text{minimise}(\text{loss} + \lambda \text{ complexity}) $$
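As a rough illustration (arbitrary weights, data loss, and lambda), the penalty and the regularised objective look like this:

```python
import numpy as np

weights = np.array([0.2, -1.5, 0.05, 3.0])   # arbitrary model weights
data_loss = 0.8                              # arbitrary loss on the training batch
lam = 0.01                                   # regularisation rate (lambda)

l2_penalty = np.sum(weights ** 2)            # w1^2 + w2^2 + ... + wn^2
total_loss = data_loss + lam * l2_penalty    # minimise(loss + lambda * complexity)
print(l2_penalty, total_loss)
```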
Tuning is required to find the ideal regularisation rate.
Early stopping is an alternative regularisation method that involves ending training before the model fully converges to prevent overfitting. It usually increases training loss but decreases test loss. It is a quick but rarely optimal form of regularisation.
Learning rate and regularisation rate tend to pull weights in opposite directions. A high learning rate often pulls weights away from zero, while a high regularisation rate pulls weights towards zero. The goal is to find the equilibrium.
Source: Overfitting: L2 regularization | Machine Learning | Google for Developers
An ideal loss curve looks like this:

To improve an oscillating loss curve:

Possible reasons for a loss curve with a sharp jump include:

Test loss diverges from training loss when:

The loss curve gets stuck when:

Source: Overfitting: Interpreting loss curves | Machine Learning | Google for Developers
from Stefan Angrick
This post is part of a four-part summary of Google's Machine Learning Crash Course. For context, check out this post.
The linear regression model uses an equation $$ y' = b + w_1x_1 + w_2x_2 + \ldots $$ to represent the relationship between features and the label.
The label y and the features x are given by the data; the bias b and the weights w are learned during training by minimising the difference between predicted and actual values.
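A minimal sketch of a single prediction from this equation, with arbitrary illustrative values for b, w, and x:

```python
import numpy as np

b = 1.5                          # bias
w = np.array([0.3, -2.0])        # weights, one per feature
x = np.array([4.0, 0.5])         # feature values for one example

y_pred = b + np.dot(w, x)        # y' = b + w1*x1 + w2*x2
print(y_pred)
```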
Source: Linear regression | Machine Learning | Google for Developers
Loss is a numerical value indicating the difference between a model's predictions and the actual values.
The goal of model training is to minimize loss, bringing it as close to zero as possible.
| Loss type | Definition | Equation |
|---|---|---|
| L1 loss | The sum of the absolute values of the difference between the predicted values and the actual values. | $$\sum |\text{actual value}-\text{predicted value}|$$ |
| Mean absolute error (MAE) | The average of L1 losses across a set of N examples. | $$\frac{1}{N}\sum |\text{actual value}-\text{predicted value}|$$ |
| L2 loss | The sum of the squared difference between the predicted values and the actual values. | $$\sum (\text{actual value}-\text{predicted value})^2$$ |
| Mean squared error (MSE) | The average of L2 losses across a set of N examples. | $$\frac{1}{N}\sum (\text{actual value}-\text{predicted value})^2$$ |
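As a quick numerical illustration of these definitions (arbitrary example values):

```python
import numpy as np

actual = np.array([3.0, -0.5, 2.0, 7.0])       # arbitrary actual values
predicted = np.array([2.5, 0.0, 2.0, 8.0])     # arbitrary predicted values

l1 = np.sum(np.abs(actual - predicted))        # L1 loss
mae = np.mean(np.abs(actual - predicted))      # mean absolute error
l2 = np.sum((actual - predicted) ** 2)         # L2 loss
mse = np.mean((actual - predicted) ** 2)       # mean squared error
```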
The most common methods for calculating loss are Mean Absolute Error (MAE) and Mean Squared Error (MSE), which differ in their sensitivity to outliers.
Training with MSE moves the model closer to the outliers but further away from most of the other data points.

A model trained with MAE is farther from the outliers but closer to most of the other data points.

Source: Linear regression: Loss | Machine Learning | Google for Developers
Gradient descent is an iterative optimisation algorithm used to find the best weights and bias for a linear regression model by minimising the loss function.
A model is considered to have converged when further iterations do not significantly reduce the loss, indicating it has found the weights and bias that produce the lowest possible loss.
Loss curves visually represent the model's progress during training, showing how the loss decreases over iterations and helping to identify convergence.
Linear models have convex loss functions, ensuring that gradient descent will always find the global minimum, resulting in the best possible model for the given data.
Source: Linear regression: Gradient descent | Google for Developers
Hyperparameters, such as learning rate, batch size, and epochs, are external configurations that influence the training process of a machine learning model.
The learning rate determines the step size during gradient descent, impacting the speed and stability of convergence.
Batch size dictates the number of training examples processed before updating model parameters, influencing training speed and noise.
Model trained with SGD:

Model trained with mini-batch SGD:

Epochs represent the number of times the entire training dataset is used during training, affecting model performance and training time.
Source: Linear regression: Hyperparameters | Machine Learning | Google for Developers
Logistic regression is a model used to predict the probability of an outcome, unlike linear regression which predicts continuous numerical values.
Logistic regression models output probabilities, which can be used directly or converted to binary categories.
Source: Logistic Regression | Machine Learning | Google for Developers
A logistic regression model uses a linear equation and the sigmoid function to calculate the probability of an event.
The sigmoid function ensures the output of logistic regression is always between 0 and 1, representing a probability.
$$
f(x) = \frac{1}{1 + e^{-x}}
$$

Linear component of a logistic regression model: $$ z = b + w_1 x_1 + w_2 x_2 + \ldots + w_N x_N $$ To obtain the logistic regression prediction, the z value is then passed to the sigmoid function, yielding a value (a probability) between 0 and 1: $$ y' = \frac{1}{1+e^{-z}} $$
z is referred to as the log-odds because if you solve the sigmoid function for z you get: $$ z = \log(\frac{y}{1-y}) $$ This is the log of the ratio of the probabilities of the two possible outcomes: y and 1 – y.
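A minimal sketch of a single logistic regression prediction, with arbitrary weights, bias, and inputs:

```python
import numpy as np

w = np.array([0.8, -0.3])           # arbitrary weights
b = -0.5                            # arbitrary bias
x = np.array([1.2, 2.0])            # arbitrary feature values

z = b + w @ x                       # linear component (the log-odds)
y_prob = 1.0 / (1.0 + np.exp(-z))   # sigmoid squashes z into (0, 1)
print(y_prob)
```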
When the linear equation becomes input to the sigmoid function, it bends the straight line into an s-shape.

Logistic regression models are trained similarly to linear regression models but use Log Loss instead of squared loss and require regularisation.
Log Loss is used in logistic regression because the sigmoid's rate of change is not constant, so prediction errors need to be measured at varying levels of precision, which the squared loss used in linear regression cannot provide.
The Log Loss equation returns the logarithm of the magnitude of the change, rather than just the distance from data to prediction. Log Loss is calculated as follows: $$ \text{Log Loss} = \sum_{(x,y)\in D} -y\log(y') - (1 - y)\log(1 - y') $$
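As a rough illustration (my own sketch, averaged over examples rather than summed; the labels and probabilities are arbitrary):

```python
import numpy as np

def log_loss(y_true, y_prob, eps=1e-15):
    """Mean Log Loss; probabilities are clipped to avoid log(0)."""
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return np.mean(-y_true * np.log(y_prob) - (1 - y_true) * np.log(1 - y_prob))

y_true = np.array([1, 0, 1, 1])            # arbitrary example labels
y_prob = np.array([0.9, 0.2, 0.6, 0.4])    # the model's predicted probabilities
print(log_loss(y_true, y_prob))
```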
Regularisation, such as L2 regularisation or early stopping, is crucial in logistic regression to prevent overfitting (due to the model's asymptotic nature) and improve generalisation.
Source: Logistic regression: Loss and regularization | Machine Learning | Google for Developers
Logistic regression models can be converted into binary classification models for predicting categories instead of probabilities.
Source: Classification | Machine Learning | Google for Developers
To convert the raw output from a logistic regression model into binary classification (positive and negative class), you need a classification threshold.
Confusion matrix
| | Actual positive | Actual negative |
|---|---|---|
| Predicted positive | True positive (TP) | False positive (FP) |
| Predicted negative | False negative (FN) | True negative (TN) |
Total of each row = all predicted positives (TP + FP) and all predicted negatives (FN + TN).
Total of each column = all real positives (TP + FN) and all real negatives (FP + TN).
When we increase the classification threshold, both TP and FP decrease, and both TN and FN increase.
Source: Thresholds and the confusion matrix | Machine Learning | Google for Developers
Accuracy, Recall, Precision, and related metrics are all calculated at a single classification threshold value.
Accuracy is the proportion of all classifications that were correct. $$ \text{Accuracy} = \frac{\text{correct classifications}}{\text{total classifications}} = \frac{TP+TN}{TP+TN+FP+FN} $$
Recall, or true positive rate, is the proportion of all actual positives that were classified correctly as positives. Also known as probability of detection. $$ \text{Recall (or TPR)} = \frac{\text{correctly classified actual positives}}{\text{all actual positives}} = \frac{TP}{TP+FN} $$
False positive rate is the proportion of all actual negatives that were classified incorrectly as positives. Also known as probability of a false alarm. $$ \text{FPR} = \frac{\text{incorrectly classified actual negatives}}{\text{all actual negatives}}=\frac{FP}{FP+TN} $$
Precision is the proportion of all the model's positive classifications that are actually positive. $$ \text{Precision} = \frac{\text{correctly classified actual positives}}{\text{everything classified as positive}}=\frac{TP}{TP+FP} $$
Precision and Recall often show an inverse relationship.
F1 score is the harmonic mean of Precision and Recall. $$ \text{F1} = 2 * \frac{\text{precision} * \text{recall}}{\text{precision} + \text{recall}} = \frac{2TP}{2TP + FP + FN} $$
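As a quick numerical illustration, these metrics can be computed directly from arbitrary confusion-matrix counts:

```python
# Arbitrary example counts from a confusion matrix.
tp, fp, fn, tn = 40, 10, 5, 45

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall = tp / (tp + fn)                 # true positive rate
fpr = fp / (fp + tn)                    # false positive rate
precision = tp / (tp + fp)
f1 = 2 * precision * recall / (precision + recall)
print(accuracy, recall, fpr, precision, f1)
```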
ROC and AUC evaluate a model's quality across all possible thresholds.
The ROC curve (receiver operating characteristic curve) plots the true positive rate (TPR) against the false positive rate (FPR) at different thresholds. A perfect model would pass through (0,1), while a random guesser forms a diagonal line from (0,0) to (1,1).
AUC, or area under the curve, represents the probability that the model will rank a randomly chosen positive example higher than a negative example. A perfect model has AUC = 1.0, while a random model has AUC = 0.5.
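A minimal sketch, assuming scikit-learn is available, of computing the ROC curve and AUC for arbitrary toy labels and scores:

```python
import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true = np.array([0, 0, 1, 1, 1, 0, 1, 0])                     # arbitrary labels
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.7, 0.2, 0.9, 0.5])   # model scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # TPR/FPR at each candidate threshold
auc = roc_auc_score(y_true, y_score)               # area under the ROC curve
print(auc)
```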
ROC and AUC of a hypothetical perfect model (AUC = 1.0) and for completely random guesses (AUC = 0.5):


ROC and AUC are effective when class distributions are balanced. For imbalanced data, precision-recall curves (PRCs) can be more informative.

A higher AUC generally indicates a better-performing model.
ROC and AUC of two hypothetical models; the first curve (AUC = 0.65) represents the better of the two models:

Threshold choice depends on the cost of false positives versus false negatives. The most relevant thresholds are those closest to (0,1) on the ROC curve. For costly false positives, a conservative threshold (like A in the chart below) is better. For costly false negatives, a more sensitive threshold (like C) is preferable. If costs are roughly equivalent, a threshold in the middle (like B) may be best.

Source: Classification: ROC and AUC | Machine Learning | Google for Developers
Prediction bias measures the difference between the average of a model's predictions and the average of the true labels in the data. For example, if 5% of emails in the dataset are spam, a model without prediction bias should also predict about 5% as spam. A large mismatch between these averages indicates potential problems.
Prediction bias can be caused by:
Source: Classification: Prediction bias | Machine Learning | Google for Developers
Multi-class classification extends binary classification to cases with more than two classes.
If each example belongs to only one class, the problem can be broken down into a series of binary classifications. For instance, with three classes (A, B, C), you could first separate C from A+B, then distinguish A from B within the A+B group.
Source: Classification: Multi-class classification | Machine Learning | Google for Developers
from
Bloc de notas
I don't know whether you remember, or whether you move so fast that by now you no longer care what happened / what's gone is gone, and so you really did learn something from me / a little about how to live
from Stefan Angrick
I like to revisit Google's Machine Learning Crash Course every now and then to refresh my understanding of key machine learning concepts. It's a fantastic free resource, published under the Creative Commons Attribution 4.0 License, now updated with content on recent developments like large language models and automated ML.
I take notes as a personal reference, and I thought I would post them here to keep track of what I've learned—while hopefully offering something useful to others doing the same.
The notes are organised into four posts, one for each course module:
from
Context[o]s
© Photo Ricard Ramon. Digital collage composed of an analogue photograph (Leica R and Kodak TriX) and a royalty-free photo from the Smithsonian Institution.
Friends who take the trouble to read these lines: a new year is approaching, full of uncertainty and expected to be even harder for the future of humanity than the last. Every indicator, and plain reality itself, shows us a landscape of confusion, of constant lies, of threats from fascist governments in East and West, of the rise of fascists on this very piece of land we inhabit, whose memory carries fresh and terrible reminders of the criminal, savage acts those same fascists always commit. Supposed democrats, opening the doors to the devil right under our noses, and with shameless glee.
The outlook is disheartening: a runaway, unstoppable climate crisis with unforeseeable consequences, and governments made up of gangs of lunatics to whom we have handed power by the grace of Meta, X and Google, whose algorithms dictate what we are supposed to think or, worse still, generate enough confusion and sense of defeat that we stop believing in anything at all. Added to this is the festering garbage of generative AI from OpenAI and its derivatives, with their mega data centres that extract water and energy and generate poverty and misery in rural areas that were already poor and battered to begin with.
Election results around the world seem to be proving a single thing: the loss of hope and of faith in a better world. Some people are rushing to vote for parties that can promise only one thing, the acceleration of our collective self-destruction, because that is essentially what fascism is: a system for crushing freedoms, whose novelty in its contemporary version is its homage and obedience to the great technology magnates (foreigners, curiously enough, for parties that claim to be so nationalist in their various colourful versions).
It is a despair that infects even the hopeful and the optimistic, who remain prisoners of their networks (never was a word more apt) and refuse to leave a raging sea in which their hope is manipulated and destroyed every day; a sea ruled by the algorithms of power, which deepens humanity's misery. As with addicts, the first step is to acknowledge the addiction; the second, essential step is to abandon the networks of the algorithm of power with all urgency, to delete X, Facebook, Instagram, TikTok and the rest for good, and to run without looking back. There is hope, beyond the dictates of the algorithm and its millionaire owners, in free networks: ownerless, federated, algorithm-free, owned by the people who make them up, where neither addiction nor virality can be manufactured without algorithms.
We need political impulses that are not born of the rage or the reactivity these networks feed (once again, the algorithm of the rich used to oppress and manipulate the poor), but that emerge from an optimistic proposal of how things ought to be done for the good. Not as a way of confronting the evil that stalks us, but because the world can only be conceived along the path of the good, of the search for truth, of the creation of spaces of community and genuinely empathetic thought.
We must use creative media to stimulate the possibility of imagining new possible worlds, not to remind ourselves permanently of the threats hanging over us. Evidence and diagnosis are only good for that: for proposing and acting, not for getting stuck in reaction and perpetual complaint against the other. That is a waste of time and energy that should instead be channelled towards restoring goodness and hope, towards imagination and the active exercise of possible worlds. There is no shortage of examples, or of people already on that path.
Perhaps we should propose more utopias in film and in art, and fewer dystopias, which seem to be all that floods our series and our cinematic and narrative offerings. We must paint, write, dance, projecting ourselves towards a better imagined future, simply because life is better in the settings of that future, and for it to exist it must be created, invented, imagined and projected collectively. And we must make visible what is already being done, without fear and with joy. We must make it fashionable again to do good, to believe in others and in a collective future that lays down some minimal foundations of shared hope. The only future is a common one; that is incontestable empirical evidence, an undeniable truth in times when they want us to believe that truth does not exist.
We live gripped by fear and we move by reaction, and we need to start moving by action. Acts, the creative, artistic, imaginative act, are the only thing that can save us from falling into despair or nihilism. Activism cannot always be reactive, because then we play as prisoners on the opponent's field, and we know you cannot win there, as we are seeing every day. We have reason on our side; the search for truth, justice, human rights, animal and environmental rights, science, the science of art, even the common sense some fascists invoke so often, reside in reason and optimism.
Yes, certainly, part of our culture has turned hostile, and a subculture of unreason and lies is emerging; we know that. Nor is it anything new or especially surprising; it has always been there in other forms, as a glance at history shows. But darkness is not fought with more darkness, nor with an excess of light. Our obligation is to create the contrast between light and shadow, which is where colour emerges, as Goethe rightly showed. Let's get to it.
#sociedad #internet #fediverso #política
from ‡
contradictions
i am full of them at times.
logic dominates my perception,
fully aligned with who i truly am.
yet a sense of doubt can still set me
back, and thinking becomes dominated
by the heart.
such is the human condition,
which i find difficult to accept within me,
wanting to be holy while human.
this has been my biggest challenge,
accepting that condition means coexisting
with this body and time.
asking others for reassurance,
knowing well i do not truly need it.
the body has ways of acting
i do not always expect,
learned behaviors shaped by a wounded ego.
for a moment i forget i am human
and turn my anger inward,
but somewhere i remember
i am not too much:
i am simply human.
i find extreme beauty in this: so many layers a human has, no flat lines. and still at times i feel shame in it = contradiction
from SmarterArticles

The game changed in May 2025 when Anthropic released Claude 4 Opus and Sonnet, just three months after Google had stunned the industry with Gemini 2.5's record-breaking benchmarks. Within a week, Anthropic's new models topped those same benchmarks. Two months later, OpenAI countered with GPT-5. By September, Claude Sonnet 4.5 arrived. The pace had become relentless.
This isn't just competition. It's an arms race that's fundamentally altering the economics of building on artificial intelligence. For startups betting their futures on specific model capabilities, and enterprises investing millions in AI integration, the ground keeps shifting beneath their feet. According to MIT's “The GenAI Divide: State of AI in Business 2025” report, whilst generative AI holds immense promise, about 95% of AI pilot programmes fail to achieve rapid revenue acceleration, with the vast majority stalling and delivering little to no measurable impact on profit and loss statements.
The frequency of model releases has accelerated to a degree that seemed impossible just two years ago. Where annual or semi-annual updates were once the norm, major vendors now ship significant improvements monthly, sometimes weekly. This velocity creates a peculiar paradox: the technology gets better faster than organisations can adapt to previous versions.
The numbers tell a striking story. Anthropic alone shipped seven major model versions in 2025, starting with Claude 3.7 Sonnet in February, followed by Claude 4 Opus and Sonnet in May, Claude Opus 4.1 in August, Claude Sonnet 4.5 in September, Claude Haiku 4.5 in October, and Claude Opus 4.5 in November. OpenAI maintained a similarly aggressive pace, releasing GPT-4.5 early in the year and its landmark GPT-5 in August, alongside o3-pro (an enhanced reasoning model), Codex (an autonomous coding agent), and the gpt-oss family of open-weight models.
Google joined the fray with Gemini 3, which topped industry benchmarks and earned widespread praise from researchers and developers across social platforms. The company also released Veo 3, a video generation model capable of synchronised 4K video with natural audio integration, and Imagen 4, an advanced image synthesis system.
The competitive dynamics are extraordinary. More than 800 million people use ChatGPT each week, yet OpenAI faces increasingly stiff competition from rivals who are matching or exceeding its capabilities in specific domains. When Google released Gemini 3, it set new records on numerous benchmarks. The following week, Anthropic's Claude Opus 4.5 achieved even higher scores on some of the same evaluations.
This leapfrogging pattern has become the industry's heartbeat. Each vendor's release immediately becomes the target for competitors to surpass. The cycle accelerates because falling behind, even briefly, carries existential risks when customers can switch providers with relative ease.
For startups building on these foundation models, rapid releases create a sophisticated risk calculus. Every API update or model deprecation forces developers to confront rising switching costs, inconsistent documentation, and growing concerns about vendor lock-in.
The challenge is particularly acute because opportunities to innovate with AI exist everywhere, yet every niche has become intensely competitive. As one venture analysis noted, whilst innovation potential is ubiquitous, what's most notable is the fierce competition in every sector going after the same customer base. For customers, this drives down costs and increases choice. For startups, however, customer acquisition costs continue rising whilst margins erode.
The funding landscape reflects this pressure. AI companies now command 53% of all global venture capital invested in the first half of 2025. Despite unprecedented funding levels exceeding $100 billion, 81% of AI startups will fail within three years. The concentration of capital in mega-rounds means early-stage founders face increased competition for attention and investment. Geographic disparities persist sharply: US companies received 71% of global funding in Q1 2025, with Bay Area startups alone capturing 49% of worldwide venture capital.
Beyond capital, startups grapple with infrastructure constraints that large vendors navigate more easily. Training and running AI models requires computing power that the world's chip manufacturers and cloud providers struggle to supply. Startups often queue for chip access or must convince cloud providers that their projects merit precious GPU allocation. The 2024 State of AI Infrastructure Report painted a stark picture: 82% of organisations experienced AI performance issues.
Talent scarcity compounds these challenges. The demand for AI expertise has exploded whilst supply of qualified professionals hasn't kept pace. Established technology giants actively poach top talent, creating fierce competition for the best engineers and researchers. This “AI Execution Gap” between C-suite ambition and organisational capacity to execute represents a primary reason for high AI project failure rates.
Yet some encouraging trends have emerged. With training costs dramatically reduced through algorithmic and architectural innovations, smaller companies can compete with established leaders, spurring a more dynamic and diverse market. Over 50% of foundation models are now available openly, meaning startups can download state-of-the-art models and build upon them rather than investing millions in training from scratch.
The rapid release cycle creates particularly thorny problems around model deprecation. OpenAI's approach illustrates the challenge. The company uses “sunset” and “shut down” interchangeably to indicate when models or endpoints become inaccessible, whilst “legacy” refers to versions that no longer receive updates.
In 2024, OpenAI announced that access to the v1 beta of its Assistants API would shut down by year's end when releasing v2. Access discontinued on 18 December 2024. On 29 August 2024, developers learned that fine-tuning babbage-002 and davinci-002 models would no longer support new training runs starting 28 October 2024. By June 2024, only existing users could continue accessing gpt-4-32k and gpt-4-vision-preview.
The 2025 deprecation timeline proved even more aggressive. GPT-4.5-preview was removed from the API on 14 July 2025. Access to o1-preview ended 28 July 2025, whilst o1-mini survived until 27 October 2025. In November 2025 alone, OpenAI deprecated the chatgpt-4o-latest model snapshot (removal scheduled for 17 February 2026), codex-mini-latest (removal scheduled for 16 January 2026), and DALL·E model snapshots (removal set for 12 May 2026).
For enterprises, this creates genuine operational risk. Whilst OpenAI indicated that API deprecations for business customers receive significant advance notice (typically three months), the sheer frequency of changes forces constant adaptation. Interestingly, OpenAI told VentureBeat that it has no plans to deprecate older models on the API side, stating “In the API, we do not currently plan to deprecate older models.” However, ChatGPT users experienced more aggressive deprecation, with subscribers on the ChatGPT Enterprise tier retaining access to all models whilst individual users lost access to popular versions.
Azure OpenAI's policies attempt to provide more stability. Generally Available model versions remain accessible for a minimum of 12 months. After that period, existing customers can continue using older versions for an additional six months, though new customers cannot access them. Preview models have much shorter lifespans: retirement occurs 90 to 120 days from launch. Azure provides at least 60 days' notice before retiring GA models and 30 days before preview model version upgrades.
These policies reflect a fundamental tension. Vendors need to maintain older models whilst advancing rapidly, but supporting numerous versions simultaneously creates technical debt and resource strain. Enterprises, meanwhile, need stability to justify integration investments that can run into millions of pounds.
According to nearly 60% of AI leaders surveyed, their organisations' primary challenges in adopting agentic AI are integrating with legacy systems and addressing risk and compliance concerns. Agentic AI thrives in dynamic, connected environments, but many enterprises rely on rigid legacy infrastructure that makes it difficult for autonomous AI agents to integrate, adapt, and orchestrate processes. Overcoming this requires platform modernisation, API-driven integration, and process re-engineering.
Successful organisations have developed sophisticated strategies for navigating this turbulent landscape. The most effective approach treats AI implementation as business transformation rather than technology deployment. Organisations achieving 20% to 30% return on investment focus on specific business outcomes, invest heavily in change management, and implement structured measurement frameworks.
A recommended phased approach introduces AI gradually, running AI models alongside traditional risk assessments to compare results, build confidence, and refine processes before full adoption. Real-time monitoring, human oversight, and ongoing model adjustments keep AI risk management sharp and reliable. The first step involves launching comprehensive assessments to identify potential vulnerabilities across each business unit. Leaders then establish robust governance structures, implement real-time monitoring and control mechanisms, and ensure continuous training and adherence to regulatory requirements.
At the organisational level, enterprises face the challenge of fine-tuning vendor-independent models that align with their own governance and risk frameworks. This often requires retraining on proprietary or domain-specific data and continuously updating models to reflect new standards and business priorities. With players like Mistral, Hugging Face, and Aleph Alpha gaining traction, enterprises can now build model strategies that are regionally attuned and risk-aligned, reducing dependence on US-based vendors.
MIT's Center for Information Systems Research identified four critical challenges enterprises must address to move from piloting to scaling AI: Strategy (aligning AI investments with strategic goals), Systems (architecting modular, interoperable platforms), Synchronisation (creating AI-ready people, roles, and teams), and Stewardship (embedding compliant, human-centred, and transparent AI practices).
How companies adopt AI proves crucial. Purchasing AI tools from specialised vendors and building partnerships succeed about 67% of the time, whilst internal builds succeed only one-third as often. This suggests that expertise and pre-built integration capabilities outweigh the control benefits of internal development for most organisations.
Agile practices enable iterative development and quick adaptation. AI models should grow with business needs, requiring regular updates, testing, and improvements. Many organisations cite worries about data confidentiality and regulatory compliance as top enterprise AI adoption challenges. By 2025, regulations like GDPR, CCPA, HIPAA, and similar data protection laws have become stricter and more globally enforced. Financial institutions face unique regulatory requirements that shape AI implementation strategies, with compliance frameworks needing to be embedded throughout the AI lifecycle rather than added as afterthoughts.
One of the most effective risk mitigation strategies involves implementing an abstraction layer between applications and AI providers. A unified API for AI models provides a single, standardised interface allowing developers to access and interact with multiple underlying models from different providers. It acts as an abstraction layer, simplifying integration of diverse AI capabilities by providing a consistent way to make requests regardless of the specific model or vendor.
This approach abstracts away provider differences, offering a single, consistent interface that reduces development time, simplifies code maintenance, and allows easier switching or combining of models without extensive refactoring. The strategy reduces vendor lock-in and keeps applications shipping even when one provider rate-limits or changes policies.
According to Gartner's Hype Cycle for Generative AI 2025, AI gateways have emerged as critical infrastructure components, no longer optional but essential for scaling AI responsibly. By 2025, expectations from gateways have expanded beyond basic routing to include agent orchestration, Model Context Protocol compatibility, and advanced cost governance capabilities that transform gateways from routing layers into long-term platforms.
Key features of modern AI gateways include model abstraction (hiding specific API calls and data formats of individual providers), intelligent routing (automatically directing requests to the most suitable or cost-effective model based on predefined rules or real-time performance), fallback mechanisms (ensuring service continuity by automatically switching to alternative models if primary models fail), and centralised management (offering a single dashboard or control plane for managing API keys, usage, and billing across multiple services).
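To make the fallback idea concrete, here is a minimal sketch in Python; the `ProviderError` type and the `client.complete()` method are hypothetical placeholders rather than any specific gateway's API:

```python
import time

class ProviderError(Exception):
    """Raised by a hypothetical provider client when a request fails."""

def call_with_fallback(prompt, providers, retries_per_provider=2):
    """Try each configured provider in order, falling back on failure.

    `providers` is an ordered list of (name, client) pairs, where each
    hypothetical client exposes a .complete(prompt) -> str method.
    """
    for name, client in providers:
        for attempt in range(retries_per_provider):
            try:
                return name, client.complete(prompt)
            except ProviderError:
                time.sleep(2 ** attempt)  # simple exponential backoff before retrying
    raise RuntimeError("All configured providers failed for this request")
```

Production gateways layer routing rules, caching, and observability on top of this basic loop, but the ordering-plus-retry pattern is the core of service continuity.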
Several solutions have emerged to address these needs. LiteLLM is an open-source gateway supporting over 100 models, offering a unified API and broad compatibility with frameworks like LangChain. Bifrost, designed for enterprise-scale deployment, offers unified access to over 12 providers (including OpenAI, Anthropic, AWS Bedrock, and Google Vertex) via a single OpenAI-compatible API, with automatic failover, load balancing, semantic caching, and deep observability integrations.
OpenRouter provides a unified endpoint for hundreds of AI models, emphasising user-friendly setup and passthrough billing, well-suited for rapid prototyping and experimentation. Microsoft.Extensions.AI offers a set of core .NET libraries developed in collaboration across the .NET ecosystem, providing a unified layer of C# abstractions for interacting with AI services. The Vercel AI SDK provides a standardised approach to interacting with language models through a specification that abstracts differences between providers, allowing developers to switch between providers whilst using the same API.
Best practices for avoiding vendor lock-in include coding against OpenAI-compatible endpoints, keeping prompts decoupled from code, using a gateway with portable routing rules, and maintaining a model compatibility matrix for provider-specific quirks. The foundation of any multi-model system is this unified API layer. Instead of writing separate code for OpenAI, Claude, Gemini, or LLaMA, organisations build one internal method (such as generate_response()) that handles any model type behind the scenes, simplifying logic and future-proofing applications against API changes.
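A minimal sketch of what such an internal method could look like; the adapter functions below are hypothetical stand-ins, not any vendor's real SDK calls:

```python
from typing import Callable, Dict

# Each adapter hides one provider's request/response format behind the
# same (prompt) -> str signature. The bodies are placeholders.
def _call_openai_style(prompt: str) -> str: ...
def _call_anthropic_style(prompt: str) -> str: ...
def _call_local_model(prompt: str) -> str: ...

_ADAPTERS: Dict[str, Callable[[str], str]] = {
    "openai": _call_openai_style,
    "anthropic": _call_anthropic_style,
    "local": _call_local_model,
}

def generate_response(prompt: str, provider: str = "openai") -> str:
    """Single internal entry point; application code never touches vendor APIs."""
    try:
        adapter = _ADAPTERS[provider]
    except KeyError:
        raise ValueError(f"Unknown provider: {provider!r}")
    return adapter(prompt)
```

Swapping or adding a provider then means registering one new adapter, not refactoring every call site.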
Whilst rapid release cycles create integration challenges, they've also unlocked powerful new capabilities, particularly in multimodal AI systems that process text, images, audio, and video simultaneously. According to Global Market Insights, the multimodal AI market was valued at $1.6 billion in 2024 and is projected to grow at a remarkable 32.7% compound annual growth rate through 2034. Gartner research predicts that 40% of generative AI solutions will be multimodal by 2027, up from just 1% in 2023.
The technology represents a fundamental shift. Multimodal AI refers to artificial intelligence systems that can process, understand, and generate multiple types of data (text, images, audio, video, and more) often simultaneously. By 2025, multimodal AI reached mass adoption, transforming from experimental capability to essential infrastructure.
GPT-4o exemplifies this evolution. ChatGPT's general-purpose flagship as of mid-2025, GPT-4o is a unified multimodal model that integrates all media formats into a singular platform. It handles real conversations with 320-millisecond response times, fast enough that users don't notice delays. The model processes text, images, and audio without separate preprocessing steps, creating seamless interactions.
Google's Gemini series was designed for native multimodality from inception, processing text, images, audio, code, and video. Gemini 2.5 Pro, released in preview in May 2025, excels in coding and building interactive web applications. Gemini's long context window (up to 1 million tokens) allows it to handle vast datasets, enabling entirely new use cases like analysing complete codebases or processing comprehensive medical histories.
Claude has evolved into a highly capable multimodal assistant, particularly for knowledge workers dealing with documents and images regularly. Whilst it doesn't integrate image generation, it excels when analysing visual content in context, making it valuable for professionals processing mixed-media information.
Even mobile devices now run sophisticated multimodal models. Phi-4-multimodal, at 5.6 billion parameters, fits in mobile memory whilst handling text, image, and audio inputs. It's designed for multilingual and hybrid use with actual on-device processing, enabling applications that don't depend on internet connectivity or external servers.
The technical architecture behind these systems employs three main fusion techniques. Early fusion combines raw data from different modalities at the input stage. Intermediate fusion processes and preserves modality-specific features before combining them. Late fusion analyses streams separately and merges outputs from each modality. Images are converted to 576 to 3,000 tokens depending on resolution. Audio becomes spectrograms converted to audio tokens. Video becomes frames transformed into image tokens plus temporal tokens.
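A toy NumPy sketch of the three fusion strategies; the random "embeddings" and the stand-in classifier head are illustrative assumptions, not any production architecture:

```python
import numpy as np

rng = np.random.default_rng(42)
text_emb = rng.normal(size=8)    # toy text embedding
image_emb = rng.normal(size=8)   # toy image embedding

def score(features, dim):
    """Stand-in for a learned classifier head: random linear layer + sigmoid."""
    w = rng.normal(size=dim)
    return 1 / (1 + np.exp(-features @ w))

# Early fusion: concatenate raw modality features, then run one model.
early = score(np.concatenate([text_emb, image_emb]), 16)

# Intermediate fusion: encode each modality separately, then combine.
text_proj = np.tanh(text_emb)    # stand-in for a modality-specific encoder
image_proj = np.tanh(image_emb)
intermediate = score(np.concatenate([text_proj, image_proj]), 16)

# Late fusion: run separate models per modality and merge their outputs.
late = (score(text_emb, 8) + score(image_emb, 8)) / 2

print(early, intermediate, late)
```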
The breakthroughs of 2025 happened because of leaps in computation and chip design. NVIDIA Blackwell GPUs enable massive parallel multimodal training. Apple Neural Engines optimise multimodal inference on consumer devices. Qualcomm Snapdragon AI chips power real-time audio and video AI on mobile platforms. This hardware evolution made previously theoretical capabilities commercially viable.
Real-time audio processing represents one of the most lucrative domains unlocked by recent model advances. The global AI voice generators market was worth $4.9 billion in 2024 and is estimated to reach $6.40 billion in 2025, growing to $54.54 billion by 2033 at a 30.7% CAGR. Voice AI agents alone will account for $7.63 billion in global spend by 2025, with projections reaching $139 billion by 2033.
The speech and voice recognition market was valued at $15.46 billion in 2024 and is projected to reach $19.09 billion in 2025, expanding to $81.59 billion by 2032 at a 23.1% CAGR. The audio AI recognition market was estimated at $5.23 billion in 2024 and projected to surpass $19.63 billion by 2033 at a 15.83% CAGR.
Integrating 5G and edge computing presents transformative opportunities. 5G's ultra-low latency and high-speed data transmission enable real-time sound generation and processing, whilst edge computing ensures data is processed closer to the source. This opens possibilities for live language interpretation, immersive video games, interactive virtual assistants, and real-time customer support systems.
The Banking, Financial Services, and Insurance sector represents the largest industry vertical, accounting for 32.9% of market share, followed by healthcare, retail, and telecommunications. Enterprises across these sectors rapidly deploy AI-generated voices to automate customer engagement, accelerate content production, and localise digital assets at scale.
Global content distribution creates another high-impact application. Voice AI enables real-time subtitles across more than 50 languages with sub-two-second delay, transforming how content reaches global audiences. The media and entertainment segment accounted for the largest revenue share in 2023 due to high demand for innovative content creation. AI voice technology proves crucial for generating realistic voiceovers, dubbing, and interactive experiences in films, television, and video games.
Smart devices and the Internet of Things drive significant growth. Smart speakers including Amazon Alexa, Google Home, and Apple HomePod use audio AI tools for voice recognition and natural language processing. Modern smart speakers increasingly incorporate edge AI chips. Amazon's Echo devices feature the AZ2 Neural Edge processor, a quad-core chip 22 times more powerful than its predecessor, enabling faster on-device voice recognition.
Geographic distribution of revenue shows distinct patterns. North America dominated the Voice AI market in 2024, capturing more than 40.2% of market share with revenues amounting to $900 million. The United States market alone reached $1.2 billion. Asia-Pacific is expected to witness the fastest growth, driven by rapid technological adoption in China, Japan, and India, fuelled by increasing smartphone penetration, expanding internet connectivity, and government initiatives promoting digital transformation.
Recent software developments encompass real-time language translation modules and dynamic emotion recognition engines. In 2024, 104 specialised voice biometrics offerings were documented across major platforms, and 61 global financial institutions incorporated voice authentication within their mobile banking applications. These capabilities create entirely new business models around security, personalisation, and user experience.
AI video generation represents another domain where rapid model improvements have unlocked substantial commercial opportunities. The technology enables businesses to automate video production at scale, dramatically reducing costs whilst maintaining quality. Market analysis indicates that the AI content creation sector will see a 25% compound annual growth rate through 2028, as forecasted by Statista. The global AI market is expected to soar to $826 billion by 2030, with video generation being one of the biggest drivers behind this explosive growth.
Marketing and advertising applications demonstrate immediate return on investment. eToro, a global trading and investing platform, pioneered using Google's Veo to create advertising campaigns, enabling rapid generation of professional-quality, culturally specific video content across the global markets it serves. Businesses can generate multiple advertisement variants from one creative brief and test different hooks, visuals, calls-to-action, and voiceovers across Meta Ads, Google Performance Max, and programmatic platforms. For example, an e-commerce brand running A/B testing on AI-generated advertisement videos for flash sales doubled click-through rates.
Corporate training and internal communications represent substantial revenue opportunities. Synthesia's most popular use case is training videos, but it's versatile enough to handle a wide range of needs. Businesses use it for internal communications, onboarding new employees, and creating customer support or knowledge base videos. Companies of every size (including more than 90% of the Fortune 100) use it to create training, onboarding, product explainers, and internal communications in more than 140 languages.
Business applications include virtual reality experiences and training simulations, where Veo 2's ability to simulate realistic scenarios can cut costs by 40% in corporate settings. Traditional video production may take days, but AI can generate full videos in minutes, enabling brands to respond quickly to trends. AI video generators dramatically reduce production time, with some users creating post-ready videos in under 15 minutes.
Educational institutions leverage AI video tools to develop course materials that make abstract concepts tangible. Complex scientific processes, historical events, or mathematical principles transform into visual narratives that enhance student comprehension. Instructors describe scenarios in text, and the AI generates corresponding visualisations, democratising access to high-quality educational content.
Social media content creation has become a major use case. AI video generators excel at generating short-form videos (15 to 90 seconds) for social media and e-commerce, applying pre-designed templates for Instagram Reels, YouTube Shorts, or advertisements, and synchronising AI voiceovers to scripts for human-like narration. Businesses can produce dozens of platform-specific videos per campaign with hook-based storytelling, smooth transitions, and animated captions with calls-to-action. For instance, a beauty brand uses AI to adapt a single tutorial into 10 personalised short videos for different demographics.
The technology demonstrates potential for personalised marketing, synthetic media, and virtual environments, indicating a major shift in how industries approach video content generation. On the marketing side, AI video tools excel in producing personalised sales outreach videos, B2B marketing content, explainer videos, and product demonstrations.
Marketing teams deploy the technology to create product demonstrations, explainer videos, and social media advertisements at unprecedented speed. A campaign that previously required weeks of planning, shooting, and editing can now generate initial concepts within minutes. Tools like Sora and Runway lead innovation in cinematic and motion-rich content, whilst Vyond and Synthesia excel in corporate use cases.
Whilst audio and video capabilities create new customer-facing applications, multi-reference systems built on Retrieval-Augmented Generation have become critical for enterprise internal operations. RAG has evolved from an experimental AI technique to a board-level priority for data-intensive enterprises seeking to unlock actionable insights from their multimodal content repositories.
The RAG market reached $1.85 billion in 2024 and is growing at roughly 49% CAGR, with organisations moving beyond proof-of-concepts to deploy production-ready systems. RAG has become the cornerstone of enterprise AI applications, enabling developers to build factually grounded systems without the cost and complexity of fine-tuning large language models. Other estimates put the market's expansion at 44.7% CAGR through 2030.
Elastic Enterprise Search stands as one of the most widely adopted RAG platforms, offering enterprise-grade search capabilities powered by the industry's most-used vector database. Pinecone is a vector database built for production-scale AI applications with efficient retrieval capabilities, widely used for enterprise RAG implementations with a serverless architecture that scales automatically based on demand.
Ensemble RAG systems combine multiple retrieval methods, such as semantic matching and structured relationship mapping. By integrating these approaches, they deliver more context-aware and comprehensive responses than single-method systems. Various RAG techniques have emerged, including Traditional RAG, Long RAG, Self-RAG, Corrective RAG, Golden-Retriever RAG, Adaptive RAG, and GraphRAG, each tailored to different complexities and specific requirements.
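To ground the terminology, here is a minimal sketch of the traditional RAG pattern in Python; the toy hashing "embedder", the document snippets, and the prompt template are assumptions for illustration, where a production system would use a real embedding model and a vector database:

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic 'embedding': hash words into a fixed-size vector.
    A real system would call an embedding model instead."""
    vec = np.zeros(dim)
    for word in text.lower().split():
        vec[hash(word) % dim] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# Hypothetical knowledge base of internal documents.
documents = [
    "Refunds are processed within 14 days of the return being received.",
    "Enterprise customers receive a 99.9% uptime service level agreement.",
    "The API rate limit is 600 requests per minute per organisation.",
]
doc_vectors = np.stack([embed(d) for d in documents])

def retrieve(query: str, k: int = 2) -> list:
    """Return the k documents most similar to the query (cosine similarity)."""
    scores = doc_vectors @ embed(query)
    return [documents[i] for i in np.argsort(scores)[::-1][:k]]

def build_prompt(query: str) -> str:
    """Augment the user question with retrieved context before generation."""
    context = "\n".join(f"- {d}" for d in retrieve(query))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

print(build_prompt("How long do refunds take?"))
```

The listed variants (Self-RAG, Corrective RAG, GraphRAG, and so on) elaborate on each stage of this loop: how documents are indexed, how retrieval is scored, and how the model is asked to use or critique the retrieved context.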
The interdependence between RAG and AI agents has deepened considerably, whether as the foundation of agent memory or enabling deep research capabilities. From an agent's perspective, RAG may be just one tool among many, but by managing unstructured data and memory, it stands as one of the most fundamental and critical tools. Without robust RAG, practical enterprise deployment of agents would be unfeasible.
The most urgent pressure on RAG today comes from the rise of AI agents: autonomous or semi-autonomous systems designed to perform multistep processes. These agents don't just answer questions; they plan, execute, and iterate, interfacing with internal systems, making decisions, and escalating when necessary. But these agents only work if they're grounded in deterministic, accurate knowledge and operate within clearly defined guardrails.
Emerging trends in RAG technology for 2025 and beyond include real-time RAG for dynamic data retrieval, multimodal content integration (text, images, and audio), hybrid models combining semantic search and knowledge graphs, on-device AI for enhanced privacy, and RAG as a Service for scalable deployment. RAG is evolving from simple text retrieval into multimodal, real-time, and autonomous knowledge integration.
Key developments include multimodal retrieval. Rather than focusing primarily on text, AI will retrieve images, videos, structured data, and live sensor inputs. For example, medical AI could analyse scans alongside patient records, whilst financial AI could cross-reference market reports with real-time trading data. This creates opportunities for systems that reason across diverse information types simultaneously.
Major challenges include high computational costs, real-time latency constraints, data security risks, and the complexity of integrating multiple external data sources. Ensuring seamless access control and optimising retrieval efficiency are also key concerns. The deployment of RAG in enterprise systems addresses practical challenges related to retrieval of proprietary data, security, and scalability. Performance is benchmarked on retrieval accuracy, generation fluency, latency, and computational efficiency. Persistent challenges such as retrieval quality, privacy concerns, and integration overhead remain critically assessed.
The competitive landscape created by rapid model releases shows no signs of stabilising. In 2025, three names dominate the field: OpenAI, Google, and Anthropic. Each is chasing the same goal: building faster, safer, and more intelligent AI systems that will define the next decade of computing. The leapfrogging pattern, where one vendor's release immediately becomes the target for competitors to surpass, has become the industry's defining characteristic.
For startups, the challenge is navigating intense competition in every niche whilst managing the technical debt of constant model updates. The positive developments around open models and reduced training costs democratise access, but talent scarcity, infrastructure constraints, and regulatory complexity create formidable barriers. Success increasingly depends on finding specific niches where AI capabilities unlock genuine value, rather than competing directly with incumbents who can absorb switching costs more easily.
For enterprises, the key lies in treating AI as business transformation rather than technology deployment. The organisations achieving meaningful returns focus on specific business outcomes, implement robust governance frameworks, and build flexible architectures that can adapt as models evolve. Abstraction layers and unified APIs have shifted from nice-to-have to essential infrastructure, enabling organisations to benefit from model improvements without being held hostage to any single vendor's deprecation schedule.
The specialised capabilities in audio, video, and multi-reference systems represent genuine opportunities for new revenue streams and operational improvements. Voice AI's trajectory from $4.9 billion to projected $54.54 billion by 2033 reflects real demand for capabilities that weren't commercially viable 18 months ago. Video generation's ability to reduce production costs by 40% whilst accelerating campaign creation from weeks to minutes creates compelling return on investment for marketing and training applications. RAG systems' 49% CAGR growth demonstrates that enterprises will pay substantial premiums for AI that reasons reliably over their proprietary knowledge.
The treadmill won't slow down. If anything, the pace may accelerate as models approach new capability thresholds and vendors fight to maintain competitive positioning. The organisations that thrive will be those that build for change itself, creating systems flexible enough to absorb improvements whilst stable enough to deliver consistent value. In an industry where the cutting edge shifts monthly, that balance between agility and reliability may be the only sustainable competitive advantage.

Tim Green UK-based Systems Theorist & Independent Technology Writer
Tim explores the intersections of artificial intelligence, decentralised cognition, and posthuman ethics. His work, published at smarterarticles.co.uk, challenges dominant narratives of technological progress while proposing interdisciplinary frameworks for collective intelligence and digital stewardship.
His writing has been featured on Ground News and shared by independent researchers across both academic and technological communities.
ORCID: 0009-0002-0156-9795 Email: tim@smarterarticles.co.uk
from Justina Revolution
I went out into the cool evening air and did my 5 phase routine. I loosened my body. I did Cosmos Palm. (This is my signature qigong sequence for power training.) I then did my Swimming Dragon Baguazhang. God, that form feels so good. I did Fut Gar and White Crane earlier today. So it has been a very complete workout day.
Tomorrow I will be talking to Dillon and Bre and then I will call Dr. Abad and get her help to get myself vaccinated so I can send Aldo the cards necessary for our residency. This is a good thing. I think things will work out.
I want to maybe play with makeup later but I never do.
The question becomes: what nourishes me? What drains me?

I was first really exposed to the Christian commemorations of the Holy Innocents thanks to a church name. Holy Innocents Episcopal Church outside Atlanta, to be exact. I visited that church and liked the architecture and liturgy and it inspired me to learn more about a story I had known since childhood but seldom dwelt on—much less saw as a focus of devotion.
It’s a story that largely gets left out of our Christmas commemorations in the Episcopal Church, partly because it is such a horrible story (and likely partly due to the more modern doubt of the story’s historical accuracy, which we’ll talk about in a bit). No one wants to follow up Christmas morning with a service about the mass-murdering of children.
At the same time, especially this year, this is a highly relevant story. Tragically, all over the world, politicians are playing like Herod and systematically executing anyone they deem a threat—including children.
Holy Innocents, also known as Childermas, commemorates an event that, in all likelihood, never happened. Josephus, an important Jewish historian, took great care to showcase the brutalities of the Herodians and never once mentioned a mass slaughter of children. Outside of the gospel of Matthew there are no other historical accounts of this story, and it seems likely to be something the evangelist intended as a means of drawing connections between Jesus and Moses, a common theme throughout that particular gospel. So what are we to make of this fact? That we not only have a day marked on our calendar but also name churches and schools for an event that probably never happened?
This is one of the tough parts of reading the Bible. It’s not always “factual” in the ways to which we are accustomed today. Nevertheless, elements that we deem “fictional” can have a huge impact on our faith and wind up speaking Truth despite their (in)accuracy.
Consider the typical Christmas pageant. Aside from Mary, Joseph, a baby, angels, and some shepherds most of the story we dramatize is completely fictional and not related to what is written in the Bible. We tend to think of the birth of Jesus as being an event that culminates after Mary and Joseph, alone on a donkey, have gone to every house or inn in Bethlehem and been told “no vacancy” and so set up shop in a nearby stable. But none of the gospels mention a donkey and we’re only told that there was no room in “the inn”—nothing at all about conversations with inn-keepers or a door-to-door journey. Further, given the nature of the census, there was probably a caravan of people traveling to Bethlehem and others taking residence among the livestock because Bethlehem was not prepared for such an influx of extra people. What we think of when we think of the Christmas story is largely fictional, but that doesn’t mean there’s not truth in those elements. We crafted those details over the centuries in order to “flesh out” the story a bit, to give it the sort of texture that it invites. And those added details speak much of the faith and mindset of the church that crafted them.
The same is true of the Massacre of the Innocents. It might not have happened, but it’s very telling that no one finds the story improbable. There might not be any records to back it up, but the story sounds like the sort of thing Herod would have done—indeed, the sort of thing that rulers all over the world and all over our history books have done.
The sort of government that gleefully cancels aid and assistance to poor countries is acting like Herod. The one that uses starvation, particularly of children, as a weapon of retaliation is acting like Herod. The political entities that travel throughout villages to murder women and children are the ones acting like Herod.
The actual Herod may not have ordered a campaign to murder the children of Bethlehem out of some fear of losing power, but Herod for sure murdered plenty of children and other innocents during his reign out of a sense that because he was in charge he could do so—without any fear of God. And in this, Herod is an archetype. Plenty of gilded so-called rulers kill innocents in the name of preserving their name on the side of buildings. If they were honest, they would admit they do so out of a desire to kill the God that they are not.
Yesterday’s saint records Jesus saying “If the world hates you, know that it hated me first.” The poet Diane di Prima says in her poem “Rant” that “the only war that matters is the war against the imagination, all other wars are subsumed by it.” I tend to think that it’s more the case that all hatred is subsumed in hatred for Jesus and, therefore, all wars are the Battle of Armageddon, the war against Christ Himself.
If the story behind Holy Innocents is fictional, then it is worth asking what it is we’re commemorating this day. I think the answer is simple: Holy Innocents commemorates all children sacrificed on the altar of expedience or inconvenience by those in power attempting to cast themselves as gods. Those killed by starvation from the abrupt end to programs like USAID or in Gaza by the Israeli government. Those killed by radicals in Somalia and Sudan. Those dying thanks to bombs dropped on Ukraine. And that’s only looking at what has been in the news in recent weeks. These are who we commemorate on Holy Innocents. The gospel story is subsumed in the stories we see right now, and is itself reflective of those stories. The gospel story helps us Christians see the shape of the story happening around us, helps us in remembering where our allegiance lies.
Herod is the one who oversees the death of innocents. Christ is the one who sees them as holy.
***
The Rev. Charles Browning II is the rector of Saint Mary’s Episcopal Church in Honolulu, Hawai’i. He is a husband, father, surfer, and frequent over-thinker. Follow him on Mastodon and Pixelfed.
#Christmas #HolyInnocents #History #Theology #Church #Christianity #War #Gaza #Ukraine

The OSRIC 3.0 Player Guide PDF has just been released for free on DriveThruRPG. Offset-print and print-on-demand editions will be available next year, as well as a GM Guide, adventures, and a host of other material.
OSRIC, the Old School Reference and Index Compilation, was the first retroclone of Advanced Dungeons & Dragons. Released almost 20 years ago, it led the charge during the early days of the OSR, providing a means to legally publish content compatible with AD&D.
OSRIC 3.0 brings a host of improvements: more explanations and examples of play, a more accessible layout in place of dense blocks of text, removal of the OGL, and rules brought even closer to AD&D, to name just a few.
Learn more about OSRIC 3.0 on BackerKit.
#News #OSRIC #OSR
from Zéro Janvier
Le jeu du cormoran is the fourth novel in Francis Berthelot's novel cycle Le Rêve du Démiurge.
The fresco continues, as we meet characters already encountered in the earlier novels: Ivan Algeiba, glimpsed as a young circus boy in Le jongleur interrompu and now a young man; Tom-Boulon, the stage manager of the Dragon theatre, and Katri, the former actress who has rediscovered her passion for singing, both of whom we had just left in the previous novel, Mélusath; and the cormorant that gives the novel its title. Could it be the reincarnation of Constantin, the juggler Ivan adored, who believed so strongly in the legend of the mythical island where the souls of the dead are reborn as birds?
We also discover other characters, such as Moa-Toa, a young androgynous Asian of indeterminate sex, and a mysterious stranger with steel-blue eyes who seems to be pursuing Ivan and appealing to his worst passions.
The story takes place in 1974. Ivan leaves the circus where he spent his childhood and adolescence and meets Moa-Toa and Tom-Boulon. Guided by the cormorant, they undertake a journey from the Landes to Paris, then on to Finland. The stages and the destination will allow each of them to confront their past and, perhaps, to find answers to the questions they carry.
In the vein of the first three novels, Francis Berthelot offers a sensitive story, steeped in symbolism, with a touch of the fantastic that grows stronger with each book.
from Douglas Vandergraph
There is a particular kind of sorrow that only parents know, and it rarely announces itself loudly. It doesn’t arrive as a dramatic rupture or a single defining argument. It shows up quietly, over time, in small moments that sting more than they should. A conversation that ends too quickly. A look that feels distant. A realization, sudden and unsettling, that your child does not see you the way you see yourself.
You know you are a good parent. Not perfect, but sincere. You showed up. You worked hard. You tried to be consistent. You tried to love well. And yet, somewhere along the way, your child’s understanding of who you are drifted from your own. At the same time, if you are honest enough to sit with the discomfort, you may sense that you no longer fully understand who they are either.
This is not failure. But it feels like it.
Modern conversations about parenting often oversimplify this tension. They frame it as rebellion versus authority, values versus culture, obedience versus freedom. But real family dynamics are rarely that clean. What most parents and children experience is not rejection but misalignment. Not hatred but confusion. Not abandonment but distance.
Scripture does not shy away from this reality. In fact, the Bible may be the most honest book ever written about family tension. From Genesis onward, it tells the truth about how love can exist alongside misunderstanding, how faith can coexist with fracture, and how God works patiently within relationships that feel strained beyond repair.
The first thing we must acknowledge—without defensiveness or shame—is this: love does not automatically produce understanding. Love can be real, sacrificial, and enduring, and still fail to communicate itself clearly across generational lines. Even God, who loves perfectly, is consistently misunderstood by His own children. That truth alone should humble us and free us at the same time.
Parents often assume that because their intentions were good, their impact must have been clear. Children often assume that because they felt misunderstood, their parents must not have cared. Both assumptions can be wrong simultaneously. This is where the gap forms—not in malice, but in misinterpretation.
One of the most difficult truths for parents to accept is that their children experience them not through intention but through perception. Children do not live inside their parents’ internal reasoning. They interpret tone, timing, emotional availability, and response. A parent may believe they were protecting. A child may have experienced that protection as control. A parent may believe they were guiding. A child may have experienced that guidance as pressure.
Neither story cancels the other. Both deserve to be heard.
Jesus understood this dynamic deeply. Throughout the Gospels, He is constantly misunderstood—by religious leaders, by crowds, even by His own disciples. Yet His response is never contempt. He does not shame misunderstanding out of people. He meets confusion with patience, distance with presence, and fear with truth spoken gently enough to be received.
That model matters profoundly for parents who want to rebuild trust.
One of the most subtle dangers in parent-child relationships is confusing moral responsibility with relational dominance. Parents are indeed responsible for guiding, protecting, and teaching. But authority that is not tempered by humility eventually creates silence. And silence is where distance grows unnoticed.
When children feel that disagreement threatens connection, they stop speaking honestly. When parents feel that questioning undermines authority, they stop listening openly. Over time, both sides retreat into assumptions instead of conversations.
This is why Scripture emphasizes listening so strongly. “Everyone should be quick to listen, slow to speak, and slow to become angry.” That instruction is not about winning debates. It is about preserving relationship. Listening communicates safety. It tells the other person, “You are not at risk simply because you are honest.”
For many parents, this is the most uncomfortable shift of all. Listening can feel like surrender. Curiosity can feel like compromise. Asking questions can feel like weakness. But in the Kingdom of God, humility is never weakness. It is strength under control.
Consider how Jesus handled those who disagreed with Him. He did not flatten them with superior arguments, even though He could have. He asked questions that exposed hearts rather than silencing voices. He told stories that invited reflection rather than forcing compliance. He created space where transformation could happen organically.
Parents who want to bridge the gap must learn to do the same.
This does not mean abandoning convictions. It means releasing urgency. Urgency communicates fear, and fear closes hearts. Presence communicates love, and love opens doors that arguments cannot.
Children often need time to articulate what they are feeling, especially when their internal world does not yet have language. When parents rush to correct before understanding, children hear one message above all others: “Your confusion is dangerous.” That message may not be intended, but it is often received.
And once a child feels that their questions are unsafe, they will search for answers elsewhere.
Another hard truth parents must face is that children do not always push away because of disagreement. Sometimes they pull away because they are exhausted from trying to be understood. Emotional distance is often a form of self-protection, not rebellion.
This is where faith calls parents to something higher than instinct. Instinct says, “Push harder.” Faith says, “Stand steadier.” Instinct says, “Fix this now.” Faith says, “Trust God with the process.”
The Bible’s most famous family reconciliation story—the Prodigal Son—is often misunderstood. The father does not chase his son down the road. He does not lecture him from a distance. He does not demand repentance as a prerequisite for love. But neither does he approve of his choices. He stays present. He stays open. He stays himself.
That posture is far more difficult than control. It requires confidence in identity rather than confidence in outcomes.
Parents who are secure in who they are do not need their children to validate them. They do not need immediate agreement to feel successful. They do not panic when seasons change. They understand that formation is a long process, and that God often does His deepest work underground, long before fruit becomes visible.
At the same time, children often underestimate the vulnerability of their parents. Parents are not fixed monuments. They are human beings shaped by their own histories, limitations, and unhealed places. Many parents parent the way they were parented—not because it was perfect, but because it was familiar.
This does not excuse harm. But it does explain complexity.
When children see their parents only as authority figures, resentment grows. When parents see their children only as extensions of themselves, disappointment grows. The bridge between them is built when both sides recognize the full humanity of the other.
God consistently works through this recognition. He reminds parents that their children ultimately belong to Him, not to parental expectation. He reminds children that honoring parents does not mean losing oneself. It means acknowledging the role love played in their becoming.
One of the most powerful moments in any family’s healing journey is when a parent can say, sincerely, “I may not have understood you as well as I thought I did.” That sentence does not erase the past, but it reframes the future. It signals safety. It invites conversation. It lowers defenses.
Likewise, one of the most powerful moments for children is recognizing that their parents’ failures were not proof of indifference, but evidence of limitation. This realization does not erase pain, but it creates room for compassion.
Faith does not demand that families pretend nothing hurts. Faith gives families the courage to name pain without letting it define the relationship.
Reconciliation, when it comes, rarely arrives as a dramatic reunion. More often, it arrives quietly. In a conversation that lasts a little longer than expected. In a question asked without accusation. In a moment where listening replaces defensiveness.
God works in those moments.
He works in the patience it takes to stay available when you feel misunderstood. He works in the humility it takes to admit you may not have all the answers. He works in the restraint it takes not to force growth before it is ready.
Parents who want to bridge the gap must release the illusion that they can control their children’s development. Control produces compliance at best. It never produces intimacy. God is after intimacy.
Children grow best in environments where love is secure enough to withstand difference. Parents become most influential when they stop trying to manage outcomes and start modeling character.
And this is where hope enters the story.
No family relationship is beyond redemption. Not because everyone will eventually agree, but because God is always at work beneath the surface. He is patient. He is creative. He specializes in restoring what feels irreparably fractured.
If you are a parent standing on one side of this gap, feeling uncertain and tired, know this: your consistency matters. Your willingness to listen matters. Your decision to remain a refuge matters.
If you are a child standing on the other side, feeling unseen or misunderstood, know this: your voice matters. Your journey matters. Your parents’ limitations do not negate the love that shaped you.
The bridge between you is not built all at once. It is built plank by plank. Conversation by conversation. Prayer by prayer.
And God is faithful to walk that bridge with you—even when you do not yet see the other side.
What makes this season between parent and child so spiritually demanding is that it forces us to confront a truth we would rather avoid: love that cannot tolerate misunderstanding is fragile. Love that collapses when it is not mirrored, affirmed, or understood has become transactional without realizing it. God’s love is not like that, and He invites parents to reflect something sturdier, something slower, something deeper.
One of the reasons the gap between parents and children widens is because both sides begin narrating the relationship internally without checking those narratives against reality. Parents quietly tell themselves, “My child doesn’t appreciate what I sacrificed,” while children quietly tell themselves, “My parent never really saw me.” Over time, these internal stories harden into assumed truth. Conversations become filtered through suspicion instead of curiosity. Every interaction feels loaded, even when no harm is intended.
Faith calls us to interrupt those stories before they become walls.
Scripture repeatedly shows that God is less concerned with how quickly understanding arrives and more concerned with whether hearts remain soft while waiting. Hardened hearts break relationships. Soft hearts allow time to do its work.
Parents often underestimate how much their emotional posture sets the climate of the relationship. Children are remarkably sensitive to emotional undercurrents. They may not articulate it clearly, but they feel when love is conditional, when disappointment lingers unspoken, when approval is tied to agreement. Even silence carries meaning.
This is why the ministry of presence is so powerful. Presence does not require fixing. It requires availability. It says, “You are welcome here even when we don’t agree.” That message does not weaken parental influence. It strengthens it.
Jesus never competed with the pace of people’s growth. He trusted the Father with timing. He understood that transformation forced is transformation aborted. Parents who rush their children’s spiritual, emotional, or ideological development often end up delaying it.
There is a deep irony here. The very pressure parents apply in the name of faith can sometimes push children further from it. Not because faith is flawed, but because fear has distorted how it is presented. When faith feels like surveillance rather than sanctuary, children associate God with anxiety instead of refuge.
This is not an accusation. It is an invitation to reflection.
Parents are often carrying unspoken fears. Fear that their child will suffer. Fear that mistakes will become permanent. Fear that distance will become loss. Fear that they will be judged for their child’s choices. These fears are understandable, but when left unchecked, they masquerade as control.
Faith does not eliminate fear automatically. Faith teaches us where to place it.
When parents entrust their children to God daily—not abstractly, but intentionally—they begin to loosen their grip without disengaging their love. They move from managing outcomes to modeling trust. Children notice this shift, even if they cannot name it.
Another essential truth is this: reconciliation does not require rewriting history. Healing does not mean pretending harm never occurred. It means choosing not to weaponize the past against the future.
Some parents hesitate to reopen conversations because they fear being blamed. Some children hesitate because they fear being dismissed. Both fears are valid. Both must be surrendered if the relationship is to move forward.
Jesus never denied people’s pain, but He also refused to let pain become the final authority. He acknowledged wounds without letting them define identity. Parents and children must learn to do the same.
One of the most healing moments in any family is when both sides stop arguing about who was right and start asking what was missing. Often what was missing was language. Or safety. Or time. Or emotional literacy. Or simply the ability to say, “I don’t know how to do this well, but I’m trying.”
That honesty disarms defensiveness.
Children, though this article speaks primarily to parents, must also be invited into responsibility. Growing into adulthood includes the difficult work of separating intention from impact without erasing either. Parents are not villains for being limited. They are human beings who carried weight long before their children were aware of it.
Honoring parents does not mean suppressing your voice. It means refusing to reduce them to their worst moments. It means acknowledging the love that existed even when it was imperfectly expressed.
At the same time, parents must release the desire to be fully understood before extending grace. Waiting for perfect understanding before offering love is another form of control. God did not wait for humanity to understand Him before loving fully. He moved first.
This is the pattern families are invited into.
Rebuilding trust often happens indirectly. Shared experiences matter more than forced conversations. Consistency matters more than speeches. Tone matters more than theology in moments of tension. Children remember how they felt long after they forget what was said.
Parents who want to remain influential must become emotionally predictable in the best sense of the word. Calm instead of reactive. Curious instead of defensive. Grounded instead of anxious. This stability creates a relational anchor children can return to when the world becomes overwhelming.
God often uses seasons of distance not to punish families, but to mature them. Distance reveals what was previously hidden. It surfaces assumptions. It exposes dependencies. It invites growth that proximity sometimes prevents.
This does not mean distance is ideal. It means it can be redemptive when surrendered to God.
Prayer becomes especially important in these seasons—not as a tool to change the other person, but as a posture that changes us. Parents who pray honestly often discover that God addresses their fears before He addresses their child’s behavior. Children who pray honestly often discover compassion for parents they once saw only as obstacles.
God is deeply invested in reconciliation, but His definition of reconciliation is broader than immediate harmony. He is building resilience, patience, humility, and love that can survive difference.
Families often want closure. God often offers transformation instead.
The bridge between parent and child is rarely rebuilt through one decisive conversation. It is rebuilt through dozens of ordinary interactions handled with care. A question asked gently. A boundary respected. A moment of humor that breaks tension. A silence that is not hostile but restful.
These moments accumulate. They matter.
If you are a parent reading this and grieving the distance, know that staying open is an act of courage. Refusing to withdraw emotionally even when you feel misunderstood is holy work. Remaining available without becoming intrusive is not weakness; it is wisdom.
If you are a child reading this and carrying unresolved pain, know that your healing does not require erasing your story. It requires refusing to let pain define the entire relationship. Compassion does not excuse harm, but it does loosen bitterness’s grip.
God is patient with families. He is not surprised by generational tension. He has been working within it since the beginning of time.
The goal is not to return to what was. The goal is to build something truer going forward—something marked by mutual dignity, spiritual humility, and love that does not panic when understanding is incomplete.
Faith does not promise that families will always agree. It promises that love does not have to disappear when they don’t.
The bridge is still buildable.
Not because everyone is ready. Not because everything is resolved. But because God is still present.
And presence, sustained over time, changes everything.
Your friend, Douglas Vandergraph
Watch Douglas Vandergraph’s inspiring faith-based videos on YouTube https://www.youtube.com/@douglasvandergraph
Support the ministry by buying Douglas a coffee https://www.buymeacoffee.com/douglasvandergraph
#FaithAndFamily #ParentingWithGrace #FaithBasedEncouragement #HealingRelationships #ChristianReflection #FamilyRestoration #SpiritualGrowth
from Silicon Seduction
Linda Lovelace (née Boreman) cycled through so many personas in her short life that it is difficult to place her in the pop culture landscape.

Is she the effervescent star of the adult movie Deep Throat (1972), ushering in a new era of sexual liberation and porn chic? Or is she the damaged and abused woman who narrates Ordeal (1980), the third memoir published under her assumed name? Perhaps she is the human equivalent of the pet rock, a highly profitable (though not for her) spasm of pop silliness that worked for a nanosecond but is inexplicable now.
Something about this woman and her story does not track. She comes across as naïve, victimized, cynical, sincere, cloying, pathetic, and brave, often in the space of just a few short quotes. Testimony from colleagues who were with her on movie sets or who knew her husband, porn impresario Chuck Traynor, only further complicates things. Half of them seem to believe she was a victim of sustained abuse, while the rest dismiss her as a practiced liar.
For all that—maybe because of that—I think she still matters. If I am asked to take Hugh Hefner seriously, as the documentary Secrets of the Playboy Mansion plausibly does, then Linda Lovelace merits attention, as well. Her once bright then strained smile flashes like a neon sign above the pornified world we live in today.
Premiering in Times Square on June 12, 1972, Deep Throat was a silly and implausible movie that bottled lightning by centering on a female character at the exact moment second-wave feminism broke through the cultural gates. The original poster zings with 1970s energy, from the “have a nice day” yellow background to the photo of Linda looking slim, happy, and satisfied. (It is a subject for another blog, but 1970s thin people were thin in a way that even thin people are not anymore.)
Though the fantasies the movie prioritized were still decidedly heterosexual male, phrases such as the “girl next door” and “nice girls like sex, too” repeated on a loop as feminists, critics, comedians, and even The New York Times propelled the movie to a top 10 box-office hit. Most scholars today estimate that the film has grossed at least $600 million worldwide, on an initial budget of approximately $30,000. It is credited with bringing adult movies to the mainstream and scaffolding the Golden Era of Porn.
Linda Lovelace always admitted that she enjoyed her moment of fame. The movie launched her onto talk shows and to parties at the Playboy Mansion, placing her among a richer and seemingly more refined class of people (however ersatz) than those she had known on the stag film circuit. Her fluttery charm worked well on television, and she seemed to be in on the joke.
The shift from the let-it-all-hang-out 1970s to the buttoned-down 1980s happened so fast that it is no wonder Linda Lovelace got whiplash. Everyone did. Part of this had to do with a resurgent conservative movement that loathed almost everything that happened after The Beatles appeared on Ed Sullivan. When the first news article on the AIDS crisis appeared in The New York Times on July 3, 1981, you could almost hear the spinning mirrored disco balls grind to a halt overnight.
Yet the most important change for Lovelace may have been the way in which second-wave feminism started to fragment into different strands, with sex-positive feminists like Camille Paglia pitted against anti-porn advocates such as Andrea Dworkin in the media. Much of America sat back and watched as feminism tore itself apart in the 1980s, but Linda Lovelace stepped right into the fray.
I confess that when I first read Ordeal, my skepticism meter was running hot, even though I knew the publisher had required Lovelace to take a lie detector test before releasing the book. This wasn’t so much a Rashomon-style kaleidoscope of differing perspectives as a complete refutation of everything that had gone before. The Lovelace in these pages is victimized repeatedly and brutally, and is the butt of the joke, never its author.
Lovelace ultimately testified before a Congressional committee investigating the porn industry in the 1980s that anyone who watched Deep Throat was watching her being raped. This statement, from the woman who embodied the sexual revolution, who had been bubbly and so much fun on the talk show circuit, caused the public to recoil. This was not a story that anyone really wanted to hear. Even Phil Donahue seemed to barely contain his contempt when she appeared on his show.
I have no idea if Linda Lovelace was or was not telling the whole truth in Ordeal, whether she enjoyed any of her scenes in Deep Throat, or if Chuck Traynor’s second marriage to porn star Marilyn Chambers was a happy one (implying that it was Linda who was the problem all along). I can easily believe that both Gloria Steinem and Andrea Dworkin genuinely liked Lovelace but also knew a useful cautionary tale when they read one.
Cultural divides as fraught as the one between the 1970s and the 1980s arrange stories along neat grids: in this case, either Linda Lovelace lied about her razzle-dazzle heyday, or she lied about being a victim of abuse and sexual trafficking. The parallel tracks of her story in these decades do not converge at any point, so we never quite know where we stand with her.
Yet perhaps we can look at this story another way, focusing less on the truth-of-the-matter and more on the culture that this hapless woman was trying to negotiate. Linda Lovelace might seem naïve, victimized, cynical, sincere, cloying, pathetic, and brave, but mostly she seems damaged. Here we have a neglected child from the lowest rung of the midcentury middle class, who grew up in an atmosphere of abuse juxtaposed with dreams of marriage and white picket fences. The first legal case of a man prosecuted for raping his wife will not take place until 1978 (Oregon v. Rideout), a full ten years after Linda meets Chuck Traynor. Post-traumatic stress disorder (PTSD) will not be added to the Diagnostic and Statistical Manual until 1980.
Linda Lovelace grew up in a world where concepts we take for granted had not yet been thought of. This does not mean everyone gets a free pass, but it does mean that putting a 2025 lens on the early 1970s should be done with care. In her early 20s, she appears in a movie that even its director predicted would flame out in a few days like virtually every other “dirty movie” before it, but that instead catapults her to Hollywood and The Tonight Show with Johnny Carson. The girl raised to please others above all other considerations is in the spotlight, is welcomed, is a star. In these strange circumstances, Linda Lovelace could have readily embraced the fact of starring in Deep Throat without having liked making it at all.
Millions of people enjoyed the sexual revolution when it happened. Linda Lovelace was among the unlucky who were saddled with an indelible record of every single good and bad choice they made in those few frothy years. She also went broke and slid right back down the class ladder. There was no one to protect her, not even Chuck Traynor.
Jack Nicholson, Roman Polanski, Frank Sinatra, and Sammy Davis, Jr., among many others who enthusiastically embraced the porn chic culture of the 1970s, got to live full lives for four more decades. They were far more talented, no doubt, but on a human level that is not really the point, especially in the post-#MeToo era. The point is they got to leave the 1970s behind. Linda Lovelace never did.
This is probably the main reason I think she matters today. Deep Throat may have been one of the first professional porn films, but Linda Lovelace was the ultimate amateur. And according to Pornhub, which keeps meticulous track of data on the adult industry, searches for “real amateur homemade” porn in 2022 grew by 310 percent in the United States and 169 percent worldwide.
So, when I do think about Linda Lovelace, I can’t help but wonder how many of today’s 19-year-old women in those amateur videos may find themselves looking at things differently in, say, 2045, while all those early choices continue to circulate on a loop in digital space.