The Dynamics of Artificial Minds: Movement, Creativity, and Open-Ended Innovation

Colours School - Institut Pascal, June 2025


Denise Lanzieri

slides at denise-lanzieri-csl.github.io/colours_school/

Sony Computer Science Laboratories

  • Paris

    1996

  • Rome

    2021

CSL Map
  • Tokyo

    1988

  • Kyoto

    2020

Sony CSL is a framework to make the wildest ideas come true, for the future of humanity and the planet

Hiroaki Kitano, Sony CTO, President and CEO, Sony CSL

Sony CSL Logo

Research Lines



Infosphere Icon

Infosphere

Tackling the challenge of redesigning Information Technologies to make information more accessible and social dialogue more transparent, understandable, and healthy.

Sustainable Cities Icon

Sustainable Cities

Aiming to provide new tools for understanding and monitoring urban environments in order to make them more sustainable.

Augmented Creativity Icon

Augmented Creativity

Studying the ability of AI to understand the complexity of open-ended systems, to support creativity, and to help people find original, brilliant, and innovative solutions.

Illustration by Fernando Cobelo

Augmented Creativity Team

    Our mission is twofold:

    • Theoretical side: Use AI to discover original solutions to complex problems—fostering true creativity at the foundations.

    • Applicative side: Apply AI to augment human creativity in real-world contexts—empowering creators and innovators.

    In this presentation, we will explore three exciting activities:

    1. S+T+ARTS Projects: Our role in a European initiative at the intersection of science, technology, and the arts.

    2. Movement Models: Developing AI models inspired by Large Language Models to learn and generate human movement.

    3. CodeFest: A multi-week coding challenge series leveraging patent data to solve real-world problems.

S+T+ARTS Projects

S+T+ARTS


    Science, Technology and Arts for the main sustainability challenges

  • S+T+ARTS AIR

    • Michail Rybakov $\to$ Measuring Personal Space in Cities: Developed the KI/s (Kinesphere Infringement per Second) metric to quantify crowd density effects on personal space using GPS data, real-world observations, and simulations.

    • Filippo Gregoretti $\to$ Moral Values & Toxicity in Art: Exploring the transformation of moral values and toxicity into emotional audiovisual experiences generated by an emotional AI.

  • S+T+ARTS Buen-TEK: Explores how indigenous knowledge and advanced technologies can work together to solve problems like climate change and pollution, and help create a stronger, more sustainable future.

Movement model

(Large) Movement Model




  • Analogy with LLMs: words $\Longrightarrow$ postures, phrases $\Longrightarrow$ movements

  • Posture quantization

  • Anticipation, prediction and comprehension of human movements

  • Smart Mirror: supporting performers' creativity by learning their individual creative style and suggesting new solutions on the fly

  • Roadmap: (diagram)

    Part I: Data Collection Pipeline





    $\Longrightarrow$ Customized workshop to bootstrap our LMM

    $\Longrightarrow$ Approaches based on deep learning to estimate human pose $\to$ sVision: Designed to identify and track human body poses in real-time.
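    sVision itself is an in-house system, but the kind of representation such a pose estimator produces can be sketched with a hypothetical normalization step: the detected 2D keypoints are centered on the pelvis and scaled by torso length so postures become translation- and scale-invariant (the joint indices below are assumed purely for illustration):

```python
import numpy as np

def normalize_pose(keypoints):
    """Center a pose on the hip midpoint and scale by torso length.

    keypoints: array of shape (n_joints, 2) with 2D (x, y) coordinates.
    Joint indices are hypothetical: 0 = left hip, 1 = right hip, 2 = neck.
    """
    kp = np.asarray(keypoints, dtype=float)
    hip_center = (kp[0] + kp[1]) / 2.0          # pelvis midpoint
    torso = np.linalg.norm(kp[2] - hip_center)  # neck-to-pelvis distance
    return (kp - hip_center) / torso            # translation/scale invariant

# Example: a toy 3-joint skeleton
pose = [[0.0, 0.0], [2.0, 0.0], [1.0, 3.0]]
norm = normalize_pose(pose)
```

    After this step, the same posture recorded at different distances from the camera maps to the same vector, which is what the downstream quantization assumes.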

    Roadmap: (diagram)

    Part II: Construction of the AI-based model





    $\Longrightarrow$ Selecting an appropriate deep neural network architecture to generate dance movements
    
    								from tensorflow.keras import activations
    								from tensorflow.keras.layers import Dense, LSTM
    								from tensorflow.keras.models import Sequential
    								from tensorflow.keras.optimizers import Adam

    								# Training hyperparameters
    								loss = 'mean_squared_error'
    								batch_size = 8
    								epochs = 2000
    								lr = 0.01
    								# Number of units (neurons) in each LSTM layer
    								n_units = 1000

    								# predictors_train and moves_vocab_size come from the data
    								# pipeline: sequences of quantized postures, encoded over
    								# the posture vocabulary.
    								model = Sequential()
    								# First LSTM layer: consumes (timesteps, vocab_size) inputs
    								# and returns the full sequence for the next recurrent layer
    								model.add(LSTM(units=n_units,
    								    input_shape=(predictors_train.shape[1], moves_vocab_size),
    								    return_sequences=True, activation=activations.tanh))
    								# Second LSTM layer with the same parameters
    								model.add(LSTM(units=n_units, return_sequences=True,
    								    activation=activations.tanh))
    								# Third LSTM layer: returns only the last output of the sequence
    								model.add(LSTM(n_units, activation=activations.tanh))
    								# Dense output layer over the posture vocabulary,
    								# with a sigmoid activation function
    								model.add(Dense(moves_vocab_size, activation='sigmoid'))
    								# Compile with the hyperparameters defined above
    								model.compile(loss=loss, optimizer=Adam(learning_rate=lr))

    Deterministic Baseline

    In my initial exploration and model-building phases, I adopted an LSTM architecture:
    • The LSTM is designed to process:

      • Each frame’s human position
      • Each frame’s acoustic features

      $\Longrightarrow$ Outputs: pose sequences

    • I am also testing other networks, including Transformers, to evaluate their performance and accuracy

    Posture distribution at time T+1

    Vector Quantization (VQ) with Self-Organizing Maps (SOM):

    Probabilistic model
    1. The model takes as input movements from videos, captured as a sequence of body positions over time

    2. The model maps each data point to the node whose weight vector is closest to the input vector

    3. A finite sequence of winning neurons then represents the body movement

    4. $\Longrightarrow$ The SOM can be visualized as a "vocabulary" of human positions, where each neuron corresponds to a specific posture
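    The quantization steps above can be sketched as follows, assuming a toy SOM whose node weights are already trained (here random, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical trained SOM: 16 nodes, each holding a codebook posture
# vector of dimension 6 (e.g. 3 joints x 2 coordinates).
som_nodes = rng.normal(size=(16, 6))

def winning_neuron(posture, nodes):
    """Map a posture to the index of the closest SOM node (best matching unit)."""
    dists = np.linalg.norm(nodes - posture, axis=1)
    return int(np.argmin(dists))

# A movement = sequence of postures -> sequence of winning-neuron indices,
# i.e. a "sentence" over the posture vocabulary.
movement = rng.normal(size=(5, 6))
tokens = [winning_neuron(p, som_nodes) for p in movement]
```

    This is exactly the words $\Longrightarrow$ postures analogy: once a movement is a sequence of discrete tokens, LLM-style sequence models apply directly.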

    Vector Quantization with SOM:

    Next step? Integrating Novelties in Deep Learning Systems

    The idea: a training algorithm inspired by Stuart Kauffman’s concept of the “adjacent possible”, used to explore new data spaces
    • The Dreaming Learning Algorithm:

      • Initial training of a probabilistic network with Vector Quantization

      • Dreaming Learning step: the network generates a new synthetic sequence

      • The network is trained again using the synthetically generated sequences
    $\to$ Enhance human creativity through dynamic human-machine interaction, fostering an evolving artistic partnership during performances.
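    The train/dream/retrain loop above can be sketched with a toy Markov model standing in for the probabilistic network (the real system operates on quantized postures; all names and the bigram model here are illustrative):

```python
import random

random.seed(0)

def train(counts, sequence):
    """Accumulate bigram counts from a token sequence (toy stand-in for the network)."""
    for a, b in zip(sequence, sequence[1:]):
        counts.setdefault(a, {}).setdefault(b, 0)
        counts[a][b] += 1
    return counts

def dream(counts, start, length):
    """Generate a synthetic sequence by sampling from the learned transitions."""
    seq = [start]
    for _ in range(length - 1):
        nxt = counts.get(seq[-1])
        if not nxt:
            break
        tokens, weights = zip(*nxt.items())
        seq.append(random.choices(tokens, weights=weights)[0])
    return seq

# 1) Initial training on observed posture tokens
model = train({}, [0, 1, 2, 1, 2, 3])
# 2) Dreaming step: the model generates a new synthetic sequence
dreamed = dream(model, start=0, length=6)
# 3) Retrain on the dreamed sequence, nudging the model toward novel regions
model = train(model, dreamed)
```

    The key design point is step 3: the model's own generations re-enter the training set, so over many iterations it can drift into regions of the data space it never observed directly.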

    Artistic experiment


    CodeFest Spring 2025 on classifying patent data

    Classifying Critical Raw Materials in Patents with LLMs


    • Scalable and fully reproducible methodology to map the role of Critical Raw Materials (CRMs) in patent-driven innovation and their alignment with the UN Sustainable Development Goals.

    • CRM-related patents are classified by the functional role of each material—use, refinement, recycling, or removal—linking these roles to broader trends in innovation, technology, and sustainability transitions.

    • My contribution:
      • Stage 1: Train a masked language model (MLM) based on a pre-trained architecture (e.g., a BERT model for chemistry) on a large corpus of unlabeled patent abstracts
      • Stage 2: Load the domain-specific model from Stage 1 and fine-tune it on a manually annotated dataset of CRM-related patent abstracts
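    The masked-language-modelling objective used in the first stage can be illustrated with a toy masking function (simplified: real BERT pre-training also leaves some selected tokens unchanged or replaces them with random tokens; the example abstract is made up):

```python
import random

random.seed(1)

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]"):
    """BERT-style masking: hide a fraction of tokens so the model
    learns to reconstruct them from context (toy version)."""
    masked, labels = [], []
    for tok in tokens:
        if random.random() < mask_rate:
            masked.append(mask_token)
            labels.append(tok)   # the model must predict this token
        else:
            masked.append(tok)
            labels.append(None)  # not scored in the loss
    return masked, labels

abstract = "a cathode comprising lithium cobalt oxide".split()
masked, labels = mask_tokens(abstract)
```

    Training on millions of such (masked, label) pairs is what adapts a generic BERT to patent vocabulary before the supervised fine-tuning stage.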

    General Conclusion



    As a former cosmologist, I never imagined I'd be working with dancers, artists, or urbanists.
    • But the mindset we develop during a PhD — critical thinking, abstraction, modeling, coding, resilience — is highly transferable.

    • You’re not locked into one domain. Your skills can help shape new fields and solve unexpected challenges.

    • Embrace uncertainty. Follow curiosity. That’s where real innovation begins.
    • Your training isn’t just about becoming an expert — it’s about learning how to explore the unknown.

    APPENDIX

    How to overcome motion capture limitations?

    • Approaches based on deep learning to estimate human pose

    • sVision: Designed to identify and track human body poses in real-time.

      • It works by detecting key points or landmarks on the human body, such as joints and other anatomical features.

      • These landmarks are then used to estimate the overall body pose, including the positions and orientations of body parts.

      • Simple setup, flexible, non-intrusive, portable

    Main Challenges of Pose Detection

    (Illustration of the depth ambiguity (Li and Lee, 2019))
    (Erroneous predictions due to self-occlusion (Shin and Halilaj, 2020))
    • 3D poses $\Longrightarrow$ Using 2D joints to recover a 3D pose becomes an ill-defined problem as one 2D skeleton may correspond to many varied 3D poses.


    • Self-occlusion and dependence on the camera viewpoint: the method might fail when parts of a person’s body obscure other parts

  • We need to evaluate the efficacy of video-based position extraction and identify systematic errors, improving the reliability of the capture system.
  • Towards a robust pipeline: (diagram)

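    One standard way to quantify such systematic errors is the mean per-joint position error (MPJPE) of the video-based poses against a reference capture; a minimal sketch (the 5 cm depth bias below is a made-up example):

```python
import numpy as np

def mpjpe(pred, gt):
    """Mean Per-Joint Position Error: average Euclidean distance
    between predicted and ground-truth joint positions, a standard
    pose-estimation metric."""
    pred, gt = np.asarray(pred, float), np.asarray(gt, float)
    return float(np.mean(np.linalg.norm(pred - gt, axis=-1)))

gt = np.zeros((4, 3))     # 4 joints, 3D reference positions (e.g. from mocap)
pred = np.zeros((4, 3))
pred[:, 2] = 0.05         # hypothetical systematic 5 cm depth bias
error = mpjpe(pred, gt)   # equals 0.05 here
```

    Tracking this metric per joint and per camera angle is one way to separate systematic biases (e.g. depth ambiguity) from random noise.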
    Future prospects: Extensions to general applications




    • Sport coaching and training

    • Physical rehabilitation

    • Speech-language disorders related to facial expressions

    • Micro-expressions detection

    • Improving human-like motions in robotic systems (movement fine-tuning)

    • Human-computer interface based on gestures and movements

    Methodology


    1. Semantic matching between CPC classes and SDGs
      We compute cosine similarity between CPC subclass titles and SDG targets using pre-trained sentence transformer models, establishing a conceptual bridge between patent classification and sustainability objectives.

    2. Keyword search of CRMs in patent abstracts
      We identify CRM mentions using a curated set of keywords and element symbols, filtered to minimise false positives and aligned with the latest EU CRM list (2023).

    3. Functional classification of CRM roles
      CRM–patent pairs are categorised into five functional roles: use, refine, recycle, remove, or wrong (non-functional mention). This classification reflects distinct innovation strategies across the CRM lifecycle.

    4. Fine-tuned LLM classification
      A BERT-based language model, adapted to the patent domain and fine-tuned on over 11,000 labelled examples, classifies CRM functions with 94% accuracy, outperforming rule-based approaches and enabling functional interpretation at scale.
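    Step 1 of the methodology reduces to a cosine similarity between embedding vectors; a minimal sketch with tiny made-up vectors standing in for the sentence-transformer outputs (real embeddings have a few hundred dimensions):

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two embedding vectors."""
    a, b = np.asarray(a, float), np.asarray(b, float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical embeddings of a CPC subclass title and an SDG target
cpc_title_emb = [0.9, 0.1, 0.0]   # e.g. a battery-related CPC title
sdg_target_emb = [0.8, 0.2, 0.1]  # e.g. a clean-energy SDG target
score = cosine_similarity(cpc_title_emb, sdg_target_emb)
```

    Thresholding this score over all CPC-SDG pairs yields the conceptual bridge between the patent classification and the sustainability objectives.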