Claude Code for Data Analysis – Vincent Codes Finance

Claude Code is a powerful new way to work with coding agents. In this post, I’ll walk through how to get started with Claude Code for data analysis, the workflow I recommend, and some of the more advanced features you might want to explore once you’re comfortable.

What is Claude Code?

Claude Code is a command-line interface (CLI) tool designed for agentic coding. Instead of just giving you autocomplete suggestions like early versions of GitHub Copilot, Claude Code allows you to interact with an AI model in a structured, project-aware way.

While tools like GitHub Copilot and Cursor also offer “agentic modes,” in practice Claude Code has consistently given me the best results and smoothest experience when using coding agents. It’s designed not just to generate code, but to collaborate with you across an entire workflow: planning, executing, and reviewing tasks with awareness of your project context.

Another practical benefit is pricing. Claude Code isn’t free, but you don’t need a special expensive developer plan to try it out. If you already have a Claude Pro subscription (the entry-level paid plan for Claude), you can start using Claude Code right away for casual coding or data analysis—no extra cost.

Note that Claude Code is not the only agentic coding tool available. Other good options include OpenAI’s Codex CLI, Google’s Gemini CLI, Cursor CLI, and open-source alternatives such as Open Code. Each has its strengths and weaknesses, but Claude Code is my personal favorite for data analysis (for now!).

Video tutorial

This post is also available as a video tutorial on YouTube.

Getting Started with Claude Code

Claude Code runs as a command-line interface (CLI), so setup is done from your terminal. Once installed, you can use it directly in your projects or alongside your existing workflow tools like Visual Studio Code and GitHub.

Installing Claude Code

Before installing Claude Code, make sure you have Node.js installed on your computer:

You can download it directly from the official Node website [Node.js link], or
If you’re on macOS and use Homebrew, run:
```
brew install node
```

Once Node is installed, you can install Claude Code globally with:

npm install -g @anthropic-ai/claude-code

After installation, you’ll have access to the claude command in your terminal.

Two optional but very convenient tools to add at this point:

Visual Studio Code and JetBrains extensions – let you see Claude’s updates and outputs directly in your IDE. [Claude Code VS Code extension link]
GitHub CLI (gh) – this allows Claude Code to interact with GitHub repositories. On macOS with Homebrew, install it with:
```
brew install gh
```

Initializing a Project

In my typical workflow, I like to set up the project repository first, before asking Claude to do anything. This way, Claude Code starts with a clean, structured environment that reflects how I want the project organized.

1. Create the repository and environment

If I’m working in Python, I usually initialize the project with uv, a fast package and dependency manager:

uv init my-project
cd my-project

Then I’ll add the core dependencies I know I’ll need (e.g. pandas, numpy, matplotlib). Claude can install more later if required, but giving it a solid baseline helps it recognize what’s already available. Note that Claude will usually try to install packages by editing pyproject.toml directly. I prefer to instruct it to use uv add <package> commands instead, so the latest versions are always installed.

2. Define a project structure

At this stage, I’ll set up directories for different types of files. For example:

src/         # source code  
data/        # raw datasets  
figures/     # plots and charts  
tables/      # analysis outputs

Claude Code can “see” this structure and will naturally start saving outputs in the right places.
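If you would rather script the layout than create the folders by hand, here is a minimal sketch. The `create_layout` helper and the `.gitkeep` placeholder files are my own illustration and are not something Claude Code requires.

```python
from pathlib import Path

# Folder names taken from the layout above; adjust to your own conventions.
PROJECT_DIRS = ("src", "data", "figures", "tables")


def create_layout(root: Path) -> list[Path]:
    """Create the project folders under `root` and return their paths."""
    created = []
    for name in PROJECT_DIRS:
        folder = root / name
        folder.mkdir(parents=True, exist_ok=True)  # safe to run more than once
        # Empty placeholder file so Git tracks the folder before it has content.
        (folder / ".gitkeep").touch()
        created.append(folder)
    return created
```

Calling `create_layout(Path.cwd())` from the project root creates the four folders in place.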

3. Add a short project description

I usually write a short Markdown file describing what I’m doing: either a README.md (if the project will be hosted on GitHub) or a simple notes.md. This provides Claude with immediate context about the project’s purpose.

4. Initialize Claude in the project

Once the project is set up, it’s time to bring Claude Code into the picture. You can either run claude in the terminal or use the VS Code extension.

The first time you launch, Claude will ask you to either sign in (if you’re on a Claude Pro or Max subscription) or provide an API key (for pay-as-you-go).

With Claude Pro, you’ll automatically use the Sonnet model, which is more than sufficient for casual coding and data analysis.
On Claude Max, you can also select the model you want with the /model command.

Once logged in, run the /init command inside Claude. This generates a CLAUDE.md file, which contains a set of instructions for your project. Claude builds this file by looking at your project’s structure and any descriptions you’ve provided.

5. Review the CLAUDE.md file

It’s good practice to open the CLAUDE.md right after it’s created. This file is included in every prompt Claude sends while working on your project, so you want to be sure it reflects the right context and conventions, and that any special instructions are clear.

Making Your Data Accessible

For data analysis projects, one of the most important steps is making sure Claude Code knows where your data is and how to access it. Depending on your workflow, there are several approaches:

1. Keep data directly in the project folder

If your dataset is small, the simplest option is to drop it into a dedicated folder inside your project (e.g. data/). Claude Code will automatically see the files in this directory and can use them in analysis.

2. Reference external data locations

If your data lives outside the project folder—say, in a Dropbox directory you share with collaborators—Claude won’t automatically know where to find it. You have two good options here:

Reference the path in CLAUDE.md – add a note in your project description telling Claude where to look for the dataset.
Use environment variables – if collaborators have different setups, it’s cleaner to define the data path in a .env file (e.g. DATA_PATH=/Users/yourname/Dropbox/finance-data). Then, in your CLAUDE.md, simply tell Claude to read the path from the .env file. In your code, you can use the python-dotenv package to load the variables from .env, and os.getenv("DATA_PATH") to access them.
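The environment-variable approach could look roughly like this in your analysis code. This is a sketch under a few assumptions: `get_data_path` is a hypothetical helper name, the fallback to a local `data` folder is my own choice, and python-dotenv has to be added to the project (for example with `uv add python-dotenv`).

```python
import os
from pathlib import Path


def get_data_path(default: str = "data") -> Path:
    """Return the dataset folder from DATA_PATH, or `default` if it is unset."""
    try:
        # Third-party package; the import is optional so the function still
        # works when DATA_PATH is set directly in the shell.
        from dotenv import load_dotenv

        # Looks for a .env file and loads it; variables that are already
        # set in the environment are not overridden by default.
        load_dotenv()
    except ImportError:
        pass
    return Path(os.getenv("DATA_PATH", default)).expanduser()
```

Each collaborator keeps their own .env file out of version control, and the analysis code only ever calls `get_data_path()`.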

3. Fetch data programmatically

For datasets stored online (e.g. academic repositories, APIs, or financial data feeds), you can also have Claude write the code to fetch the data directly as part of the workflow. This is especially useful when you want reproducibility.
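As an illustration of what such fetching code might look like, here is a small download-and-cache sketch using only the standard library. The `fetch_if_missing` name and the `.part` temporary file are assumptions of mine; real sources often need authentication or a dedicated client library instead.

```python
import shutil
import urllib.request
from pathlib import Path


def fetch_if_missing(url: str, dest: Path) -> Path:
    """Download `url` to `dest` unless the file is already there."""
    if dest.exists():
        return dest  # reuse the cached copy so reruns skip the download
    dest.parent.mkdir(parents=True, exist_ok=True)
    tmp = dest.with_name(dest.name + ".part")
    with urllib.request.urlopen(url) as response, open(tmp, "wb") as out:
        shutil.copyfileobj(response, out)
    tmp.replace(dest)  # rename only once the download has completed
    return dest
```

Because the function returns early when the file exists, deleting the cached file is the way to force a fresh download.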

Have Claude describe the data

Before you start the analysis, a good idea is to have Claude read part of the data to infer its schema. This way, it will understand what’s in your dataset.

Have Claude write this information to a Markdown file so you can review it and make any required corrections. You can then direct Claude to look at this file for reference. This ensures that Claude will always have a clear idea of what each of these columns means.
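To give a sense of what such a schema file could contain, here is a dependency-free sketch that samples a CSV file and writes a Markdown table. The function names, the table layout, and the three-way int/float/text guess are all illustrative assumptions; Claude will typically produce something richer, often with pandas.

```python
import csv
from pathlib import Path


def infer_type(values: list[str]) -> str:
    """Guess a column type from sample strings: int, float, or text."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "empty"
    for label, cast in (("int", int), ("float", float)):
        try:
            for v in non_empty:
                cast(v)
            return label
        except ValueError:
            continue
    return "text"


def describe_csv(csv_path: Path, out_path: Path, sample_rows: int = 100) -> str:
    """Write a Markdown table describing the columns of `csv_path`."""
    with open(csv_path, newline="") as f:
        reader = csv.DictReader(f)
        # Read at most `sample_rows` rows; the full file is never loaded.
        rows = [row for _, row in zip(range(sample_rows), reader)]
        columns = reader.fieldnames or []
    lines = [
        "| Column | Inferred type | Example | Description |",
        "| --- | --- | --- | --- |",
    ]
    for col in columns:
        values = [row[col] or "" for row in rows]
        example = next((v for v in values if v.strip()), "")
        # The description cell is left blank on purpose for manual review.
        lines.append(f"| {col} | {infer_type(values)} | {example} |  |")
    text = "\n".join(lines) + "\n"
    out_path.write_text(text)
    return text
```

The empty Description column is where your own corrections go, since column meanings usually cannot be inferred from the values alone.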

Extending Claude with MCPs

Claude Code is only as effective as the information it has access to. By default, it relies on its training data and the context you provide in your project. But as with any LLM, relying solely on training knowledge has two big limitations:

Knowledge cutoff: the model will not know about updates or changes that happened after its training point.
Hallucination risk: without reliable references, it may generate outdated or incorrect code. Claude is pretty good at recovering from mistakes, but they waste time and tokens, so it’s always better to avoid them in the first place.

This is where the Model Context Protocol (MCP) comes in. MCP servers, often just called MCPs, are connectors that allow Claude to query external tools, APIs, or datasets, enriching its context with up-to-date, authoritative information.

One MCP I consider essential is context7. This tool gives Claude access to package documentation on demand. For example:

If Claude needs to use pandas, it can pull the latest pandas documentation instead of relying on memory from an older version. It can even query specific functions or classes to ensure correct usage.
If you are running a regression in statsmodels, Claude can check the official API docs to confirm argument names and recommended usage.

In practice, context7 makes a big difference because the agent can rely on current documentation instead of outdated training knowledge, which means you get to the right results more quickly.

Installing and Using context7

You can install context7 with the following command in the terminal:

claude mcp add --transport http context7 https://mcp.context7.com/mcp

If you have an instance of Claude Code running already (e.g. if the VS Code extension is open), you may need to restart it for the MCP to be recognized. Once installed, you can make Claude use it by simply adding a note in your prompt, for example:

Please use context7 to get the latest documentation before writing the code.

Plan, Review, Execute, Review

One of the biggest advantages of Claude Code compared to autocomplete-style tools is that it can work with you in a structured, iterative way. To get the most out of it, it helps to think of your interaction with Claude as a collaborative workflow rather than a one-shot prompt.

The way I like to think of Claude Code is as a research assistant who works hard but needs guidance to stay on track. The best results come from a cycle of:

Step 1: Plan

Start by giving Claude a clear, specific task with as much detail as you can provide. Then, ask Claude to outline a plan of how it intends to complete the task. This gives you visibility into its approach before any code is written.

For a more complicated task, it is a good idea to have Claude write a to-do list of the plan as a markdown file. This way, it can keep track of its progress. If you have to stop the task in the middle, you will always know where you are and what’s left to do.

Step 2: Review the Plan

Go through Claude’s proposed steps and make adjustments if needed. This is very important because everything that follows will be based on this plan.

Step 3: Execute Step by Step

Ask Claude to execute the plan one step at a time. This way, you can monitor progress, catch mistakes early, and keep the workflow under control. If you see something that doesn’t look right, you can press the Escape key to pause Claude and correct it before it goes too far.

Step 4: Review and Adjust

Once the task is complete, review the results with Claude. If something looks off, provide feedback and ask it to refine the code or analysis.

In practice, this cycle—plan, review, execute, review—turns Claude into a true coding partner/research assistant.

Just like any human collaborator, Claude benefits from clear instructions, regular check-ins, and constructive feedback. And just like any research assistant, it is your responsibility to ensure the final results are accurate and meaningful.

Analysis, Figures, and Jupyter Notebooks

Once Claude Code begins working through your tasks, you can treat it like a research assistant who generates both results and visual outputs. The key is to ask explicitly for the kinds of outputs you want at each stage.

Summary Statistics and Results

It is always helpful to request summary statistics and intermediate results before diving into more advanced analysis. I also like to ask Claude to comment on these results.

This has two benefits:

Claude can flag cases where numbers seem inconsistent or unexpected, which can save time.
Sometimes Claude will even realize that results don’t make sense and adjust its own analysis automatically.

That said, I do not rely on Claude’s interpretation. As the domain expert, I am the one responsible for evaluating whether results are meaningful.
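The kind of check I have in mind can be sketched in a few lines. The example below uses only the standard library, and the return series, the plausibility bounds, and the function names are made up for illustration; with pandas you would more likely start from `DataFrame.describe()`.

```python
import math
import statistics


def summarize(values: list[float]) -> dict[str, float]:
    """Basic summary statistics, ignoring missing (NaN) observations."""
    clean = [v for v in values if not math.isnan(v)]
    return {
        "n": len(clean),
        "missing": len(values) - len(clean),
        "mean": statistics.fmean(clean),
        "std": statistics.stdev(clean),  # needs at least two observations
        "min": min(clean),
        "median": statistics.median(clean),
        "max": max(clean),
    }


def sanity_flags(stats: dict[str, float], lower: float, upper: float) -> list[str]:
    """Return readable warnings for values outside a plausible range."""
    flags = []
    if stats["missing"] > 0:
        flags.append(f"{stats['missing']} missing observations")
    if stats["min"] < lower:
        flags.append(f"min {stats['min']} is below {lower}")
    if stats["max"] > upper:
        flags.append(f"max {stats['max']} is above {upper}")
    return flags


# Hypothetical daily returns: a value of 1.5 (150%) is likely a data error.
returns = [0.01, -0.02, 0.005, float("nan"), 1.5]
for flag in sanity_flags(summarize(returns), lower=-0.5, upper=0.5):
    print(flag)
```

The bounds encode domain knowledge that only you can supply, which is why the final judgment stays with you rather than with Claude.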

Figures and Visualizations

Figures are an essential part of understanding data, and Claude can generate them directly in your workflow. A useful practice is to:

Ask Claude to save plots into a dedicated figures/ directory.
Provide detailed instructions on formatting, labeling, and what you expect to see.
Let Claude inspect its own output.

Since Claude is multimodal, it can actually look at the figures it generates and check whether they match your requirements. If something is off, for example a mislabeled axis or an incorrect chart type, Claude often corrects it automatically.

That said, you should still review the figures yourself. Most of the time Claude gets close to what you want, but your detailed feedback ensures the final results are presentation-ready.
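For reference, plotting code that follows these practices might look like the sketch below. It assumes matplotlib is installed (for example via `uv add matplotlib`); the `save_line_plot` helper, figure size, and resolution are illustrative choices of mine, not defaults Claude uses.

```python
from pathlib import Path


def save_line_plot(x, y, title: str, xlabel: str, ylabel: str,
                   out_dir: Path = Path("figures"),
                   name: str = "plot.png") -> Path:
    """Draw a labeled line chart and save it under `out_dir`."""
    # Imported here so the rest of a script still loads without matplotlib.
    import matplotlib
    matplotlib.use("Agg")  # file-only backend, so no display is needed
    import matplotlib.pyplot as plt

    out_dir.mkdir(parents=True, exist_ok=True)
    fig, ax = plt.subplots(figsize=(8, 4.5))
    ax.plot(x, y)
    ax.set_title(title)
    ax.set_xlabel(xlabel)
    ax.set_ylabel(ylabel)
    fig.tight_layout()
    out_path = out_dir / name
    fig.savefig(out_path, dpi=200)
    plt.close(fig)  # release the figure when generating many plots
    return out_path
```

Returning the saved path makes it easy to point Claude at the exact file you want it to inspect.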

Jupyter Notebooks

Claude Code can also generate Jupyter Notebooks that capture the workflow in a transparent, reproducible format. This has two main advantages:

You can open the notebook and run the code directly, verifying outputs line by line.
You get a structured record of the analysis steps, which makes it easier to tweak or extend the work later.

It is best practice to ask Claude to save notebooks in a dedicated folder (e.g. notebooks/). And if you have existing notebooks, you can ask Claude to edit or extend them as part of the workflow. It can read notebooks directly, including looking at figures and outputs.
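Because a .ipynb file is plain JSON, you can also get a quick overview of what Claude put in a notebook without opening Jupyter. The sketch below assumes the version 4 notebook format with a top-level `cells` list; `summarize_notebook` is a hypothetical helper name.

```python
import json
from pathlib import Path


def summarize_notebook(path: Path) -> list[str]:
    """Return one line per cell: its type, first source line, output count."""
    nb = json.loads(path.read_text(encoding="utf-8"))
    lines = []
    for i, cell in enumerate(nb["cells"]):
        source = cell["source"]
        # The format allows `source` to be a string or a list of strings.
        text = "".join(source) if isinstance(source, list) else source
        first = text.strip().splitlines()[0] if text.strip() else "(empty)"
        n_out = len(cell.get("outputs", []))  # only code cells have outputs
        lines.append(f"[{i}] {cell['cell_type']}: {first} ({n_out} outputs)")
    return lines
```

This is only a quick structural check; running the notebook yourself is still the way to verify the results.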

What to Explore Next

Once you are comfortable using Claude Code for basic analysis, there are some more advanced features worth keeping in mind.

Managing Context

Claude relies heavily on context to perform well. This includes:

Your project files and structure
The CLAUDE.md file you defined earlier
Any MCPs you have enabled
The outputs it has generated so far
The running history of your conversation

More context generally means better results, but there are trade-offs:

Quota usage: larger contexts consume more tokens, so you may hit your plan limits faster.
Noise: as the session history grows, older details may crowd the context and reduce clarity for current tasks.

Claude will automatically “compact” the context, i.e. condense it into a smaller summary, when it gets too large. But there is always information loss when this happens, and in my experience the performance degrades noticeably right after a compaction, especially if it happens in the middle of a complex task.

There are a few strategies to manage this:

Use /clear to reset context when you are done with a line of work.
Use /compact to manually trigger compaction at logical transition points.

Sub-agents

Another advanced feature is the ability to define sub-agents. These are specialized agents you configure for particular types of work, such as generating figures, running statistical analyses, or cleaning datasets.

Each sub-agent comes with its own set of instructions and only receives the context it needs for its specific task. The main agent can then call the sub-agent when appropriate. This has three main benefits:

Tasks run faster because the context is smaller.
You save on quota since fewer tokens are consumed running the sub-agent.
The context of the main agent stays smaller for longer, as it only includes the final outputs from sub-agents, not all the steps they took.

Sub-agents require a bit more setup, but they are a powerful way to keep Claude focused and efficient in larger projects.

Wrapping Up

Claude Code is more than just a coding assistant—it is a full workflow partner that can plan, execute, review, and document your analysis. With tools like MCPs, multimodal feedback, and context management, it can streamline even complex projects.

If you are just starting out, focus on the basics: setting up your project, making your data accessible, and working in an iterative plan-review-execute loop. As you grow more comfortable, you can explore context management and sub-agents to push your workflow further.

Reuse

CC BY-NC-SA 4.0