FileGram: Grounding Agent Personalization in File-System Behavioral Traces

Abstract

Coworking AI agents operating within local file systems are rapidly emerging as a paradigm in human–AI interaction. Since users exhibit highly diverse workflows, personalization is essential for tight collaboration and a seamless user experience. However, effective personalization is limited by severe data constraints, since strict privacy barriers and the inherent difficulty of jointly collecting multimodal real-world traces preclude the creation of scalable training data and comprehensive evaluation suites. Consequently, existing methods remain interaction-centric and overlook dense behavioral cues embedded in file-level activities. To bridge this gap, we propose FileGram, a comprehensive framework that grounds agent memory and personalization in file-system behavioral traces. FileGram comprises three core components: (1) FileGramEngine, a scalable, persona-driven data engine that simulates realistic workflows; (2) FileGramBench, a diagnostic benchmark that treats file operations as behavioral engrams; (3) FileGramOS, a bottom-up memory architecture that builds user profiles directly from atomic file-level signals. Extensive experiments show that FileGramBench remains challenging for state-of-the-art memory systems, and demonstrate the effectiveness of FileGramEngine and FileGramOS.

Framework

Three components address data scarcity, evaluation gaps, and method limitations.

Data Generation

FileGramEngine

Scalable persona-driven simulation producing 640 controlled trajectories with ground-truth labels across 6 behavioral dimensions and 20 user profiles.

Evaluation

FileGramBench

First file-system memory benchmark. Four tracks: profile reconstruction, reasoning, anomaly detection, and multimodal visual grounding.

Method

FileGramOS

Bottom-up memory that builds user profiles from atomic file signals (procedural, semantic, and episodic channels) rather than from dialogue summaries.

FileGramEngine

20 User Profiles

Tasks

640 Trajectories

6 Dimensions

FileGramEngine simulates realistic file-system workflows via persona-driven agents. Each profile is defined by six behavioral dimensions, producing fine-grained multimodal action sequences at scale.

~10K Output Files

PDF: 3,609
Document: 3,093
Markdown: 2,310
Presentation: 1,547
Image: 1,031
Audio: 516
Spreadsheet: 516

20,028 Atomic Actions

File Read: 4,541
Cross-File Ref: 4,094
Context Switch: 3,909
File Write: 3,024
File Browse: 1,649
File Edit: 1,057
Dir Create: 944
Others: 810
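As a quick sanity check on the breakdown above, the following sketch tallies the per-type counts and converts them to shares of the total. The counts are copied from this page; the names `ACTION_COUNTS` and `action_shares` are illustrative and not part of any released FileGram code.

```python
# Counts copied from the atomic-action breakdown on this page.
ACTION_COUNTS = {
    "File Read": 4541,
    "Cross-File Ref": 4094,
    "Context Switch": 3909,
    "File Write": 3024,
    "File Browse": 1649,
    "File Edit": 1057,
    "Dir Create": 944,
    "Others": 810,
}


def action_shares(counts):
    """Return each action type's fraction of all actions."""
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()}


TOTAL_ACTIONS = sum(ACTION_COUNTS.values())
SHARES = action_shares(ACTION_COUNTS)

print(f"total actions: {TOTAL_ACTIONS:,}")
for name, share in sorted(SHARES.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<15} {share:6.1%}")
```

The per-type counts add up to the reported 20,028 actions. File reads, cross-file references, and context switches together account for well over half of them.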

FileGramBench

4,653 QA pairs across 4 evaluation tracks and 3 memory channels.

Track I: Understanding. Profile reconstruction & fingerprinting (886 questions)

Track II: Reasoning. Pattern inference & disentanglement (1,694 questions)

Track III: Detection. Anomaly & behavioral drift (1,103 questions)

Track IV: Multimodal. Visual grounding from recordings (650 questions)

Example questions from the four evaluation tracks, testing behavioral memory from procedural file operations to cross-modal visual reasoning.

FileGramOS

Stage 1: Engram Encoding. Per-trajectory atomic signal extraction

Stage 2: Consolidation. Cross-engram structured MemoryStore

Stage 3: Adaptive Retrieval. Query-time evidence composition

FileGramOS builds profiles from atomic file signals, preserving procedural, semantic, and episodic memory through a three-stage bottom-up pipeline.
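To make the shape of the three stages concrete, here is a minimal sketch under strong assumptions. It is not the authors' implementation. The `Action` schema, the use of file extensions as a stand-in for the semantic channel, and the keyword-based routing in `retrieve` are all hypothetical choices made for illustration; only the stage names and the three channels come from this page.

```python
import re
from collections import Counter
from dataclasses import dataclass, field
from pathlib import PurePosixPath


@dataclass(frozen=True)
class Action:
    """One atomic file-system event (hypothetical schema)."""
    kind: str  # e.g. "file_read", "file_write", "dir_create"
    path: str


@dataclass
class Engram:
    """Stage 1 output: signals extracted from a single trajectory."""
    trajectory_id: str
    procedural: Counter  # which operations the user performs
    semantic: Counter    # file extensions touched (assumed stand-in)
    episodic: list       # ordered raw events


@dataclass
class MemoryStore:
    """Stage 2 output: signals merged across engrams."""
    procedural: Counter = field(default_factory=Counter)
    semantic: Counter = field(default_factory=Counter)
    episodic: dict = field(default_factory=dict)


def encode(trajectory_id, actions):
    """Stage 1: per-trajectory atomic signal extraction."""
    return Engram(
        trajectory_id=trajectory_id,
        procedural=Counter(a.kind for a in actions),
        semantic=Counter(PurePosixPath(a.path).suffix or "<none>" for a in actions),
        episodic=list(actions),
    )


def consolidate(engrams):
    """Stage 2: merge engrams into one structured store."""
    store = MemoryStore()
    for e in engrams:
        store.procedural.update(e.procedural)
        store.semantic.update(e.semantic)
        store.episodic[e.trajectory_id] = e.episodic
    return store


# Hypothetical routing table: which query words select which channel.
CHANNEL_KEYWORDS = {
    "procedural": {"how", "habit", "workflow"},
    "semantic": {"format", "type", "kind"},
    "episodic": {"when", "session", "first"},
}


def retrieve(store, query, top_k=3):
    """Stage 3: compose evidence only from channels the query seems to need."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    evidence = {}
    for channel, keywords in CHANNEL_KEYWORDS.items():
        if not words & keywords:
            continue
        if channel == "episodic":
            evidence[channel] = {t: ev[:top_k] for t, ev in store.episodic.items()}
        else:
            evidence[channel] = getattr(store, channel).most_common(top_k)
    return evidence


t1 = [Action("dir_create", "notes"), Action("file_write", "notes/plan.md"),
      Action("file_read", "notes/plan.md")]
t2 = [Action("file_read", "data/q1.csv"), Action("file_write", "report.md")]
store = consolidate([encode("t1", t1), encode("t2", t2)])
evidence = retrieve(store, "Which file format does this user prefer?")
print(evidence)
```

The point of the sketch is the data flow: raw events are kept per trajectory, aggregated without being summarized into free text, and only the channels relevant to a question are pulled at query time. A real system would replace the keyword table with a learned or model-driven router.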

Results

FileGramOS outperforms all baselines on FileGramBench.

SimpleMem: 32.9
Mem0: 33.2
MemOS: 36.2
Zep: 40.2
Naive RAG: 40.5
MemU: 44.4
MMA: 44.7
Full Context: 48.0
Eager Summ.: 49.5
EverMemOS: 49.9
VisRAG: 51.9
FileGramOS: 59.6

+7.7 points over the best baseline (VisRAG, 51.9)
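The margin can be recomputed directly from the scores listed above. The short sketch below finds the strongest baseline and reports both the absolute gap and the relative improvement; the helper name `margin_over_best_baseline` is illustrative, and the scores are copied from this page.

```python
# Scores copied from the results listed on this page.
SCORES = {
    "SimpleMem": 32.9,
    "Mem0": 33.2,
    "MemOS": 36.2,
    "Zep": 40.2,
    "Naive RAG": 40.5,
    "MemU": 44.4,
    "MMA": 44.7,
    "Full Context": 48.0,
    "Eager Summ.": 49.5,
    "EverMemOS": 49.9,
    "VisRAG": 51.9,
    "FileGramOS": 59.6,
}


def margin_over_best_baseline(scores, system):
    """Return (best baseline, absolute gap, relative gain) for `system`."""
    baselines = {name: s for name, s in scores.items() if name != system}
    best = max(baselines, key=baselines.get)
    abs_gap = scores[system] - baselines[best]
    rel_gain = abs_gap / baselines[best]
    return best, abs_gap, rel_gain


best, abs_gap, rel_gain = margin_over_best_baseline(SCORES, "FileGramOS")
print(f"best baseline: {best}")
print(f"absolute gap: {abs_gap:.1f} points")
print(f"relative gain: {rel_gain:.1%}")
```

The 7.7 figure is the absolute gap between 59.6 and 51.9. Expressed relative to the VisRAG score, the same gap is roughly 14.8%.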

Qualitative comparison. Left: A BehavFP question where FileGramOS's three-channel architecture jointly recovers the correct profile, while baselines each miss different signals. Right: A TraceDis question involving multimodal artifacts, where cross-format output gaps cause widespread failures.

BibTeX

@misc{liu2026filegramgroundingagentpersonalization,
      title={FileGram: Grounding Agent Personalization in File-System Behavioral Traces},
      author={Shuai Liu and Shulin Tian and Kairui Hu and Yuhao Dong and Zhe Yang and Bo Li and Jingkang Yang and Chen Change Loy and Ziwei Liu},
      year={2026},
      eprint={2604.04901},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2604.04901},
}