Video: Claude Code for Semiconductor Teams: Live AMA with Anthropic’s Applied AI Team | Duration: 2936s | Summary: Claude Code for Semiconductor Teams: Live AMA with Anthropic’s Applied AI Team | Chapters: Welcome and Introduction (6.4s), Anthropic for Semiconductors (125.935s), Claude.md File Explained (272.715s), CloudMD Governance Practices (380.75s), CloudMD for Hardware (583.985s), Parallel Agent Clusters (647.8s), Sub-Agent Management Strategies (820.9s), Agent Teams Explained (971.715s), React Simon Game (1091.545s), Parallel Agent Implementation (1478.17s), Agent and Skill Distinctions (1929.655s), EDA Knowledge Base (2295.535s), AI in Circuit Design (2377.58s), Managing Context Windows (2507.14s), AI in Semiconductors (2565.83s), Future of LLMs (2747.65s)
Transcript for "Claude Code for Semiconductor Teams: Live AMA with Anthropic’s Applied AI Team": Alright. Good morning, everyone. Good afternoon. Really excited today to bring to you Claude Code for semiconductor teams, a first-of-its-kind webinar that we're doing specifically for the semiconductor space, a space where we've been putting a lot of investment in product work previously. So excited to bring that to this audience today. Just some quick housekeeping items. A recording of this session will be distributed via email within twenty-four hours. You can submit questions at any time. Use the questions widget in the webinar portal, and we'll get to those at the end. And feel free to give us feedback too. We'd love for you to rate the webinar by selecting the survey widget at any point. Just a quick introduction to our team. I'm Jeff Garcia, strategic account executive here at Anthropic. I work with many of our semiconductor customers here in Silicon Valley. I'm based in the Bay Area. Eric Burns, our field CTO from the Applied AI organization, is here as well. And then Mahima Rupakula from our Applied AI team, specifically an Applied AI architect who's worked in the hardware and semiconductor spaces over the last several months with Anthropic customers. Quick agenda. We'll start with AI for semiconductor teams, a good overview of what the capabilities look like. We'll get into Claude Code thereafter. Claude Code has been a breakout hit over the last year. So much of the value that our semiconductor customers are getting from Claude is with Claude Code. So we'll spend plenty of time there. We'll do a live question and answer toward the end. We'll take some of your questions that have come across with Eric and Mahima. And then, briefly at the end, we can touch on how to get started and some next steps after today.
What we've found over the last year is that Anthropic is an especially good fit for semiconductors, for chip design and verification, for a few reasons. One is that it speaks the language. Claude has been shooting up on chip design benchmarks with recent model releases. Opus 4.5 and Opus 4.6 specifically, with no fine-tuning, saw big jumps in their ability to generate RTL, the largest of any model provider, really as a result of improvements to the harness built around the model rather than changes to the training data or any kind of fine-tuning. Second, it has a context window that can process massive design specs and datasets. Third, we'll never train on your data. That's a contractual guarantee that we provide. We also have secure deployment options through AWS Bedrock, Google Cloud Vertex, and soon Microsoft Azure. And then fourth, Claude reasons intuitively. It can do so on root cause analysis, yield optimization, verification. For the problems where brute-force pattern matching is not enough, Claude has the reasoning capabilities to take those problems on. And so without further ado, I'll go ahead and hand the stage over to Eric and Mahima to get started with the presentation.
Thank you, Jeff. Well, as Jeff said, I'm joined here by my colleague, Mahima, from our Applied AI team. And we're gonna walk you through a high-level overview of a couple of new and exciting features in Claude Code, and then we'll do a software demo. We're gonna do it live, so be kind. And then we'll take everybody's questions and hopefully talk about some of the special cases for the semiconductor industry. And so with that, we'll get going. So, hopefully, everybody has had a chance to use Claude Code or at least a coding agent at some point.
And, you know, one of the first experiences that you have when working with a coding agent is a desire to not have to reexplain yourself every time about what's going on, to guide it, to write code in your repo the way that you wanna write code, to tell it what's happening in there, and kinda give it a sense of of things that may not be intuitive or that might only be discoverable through crawling a lot of code. And then finally, as your as your repo grows up and as your use of agents becomes more complex, how to govern this and how to specialize based on different functional teams. So, we've created, not surprisingly, a markdown file, that is a special file for Claude, which is the CLAUDE.md file. And, this is just markdown. It's natural language. But the best way to think of this is that this is the, the jumping off point for Claude in interpreting everything that's happening in a particular repo or a part of a repo. So you can think of it like a read me that it is definitely going to read before it begins doing anything in the the repository, certainly writing any code, but even just, you know, scanning it and trying to understand patterns. And the exciting thing about CLAUDE.md is that it is a nested hierarchical approach. And so the very first step in getting a code base to be kind of ready for working with Claude is to go through kind of a an exploration mode, to try to just map out the state of the code base. Of course, you know, the user can tell Claude, what's most important about it, where to look first. But the the first step is kind of, you know, getting getting our feet under us. And so at the end of that process of walking through the code base well enough to begin coming up with a plan or, you know, writing a feature, or even just building a test, the culmination of that is Claude's ability to save a CLAUDE.md. And, critically, this isn't something that humans should be maintaining by hand. 
It's something that Claude is able to output on demand using its memory capability. And the best way to think of the the CLAUDE.md output is that, in theory, it should represent the smallest number of tokens that can consistently describe for Claude Code how to interact with a particular code base. So once you've established the baseline CLAUDE.md for your mono repo, you can then move into sub packages or subcomponents and begin to specialize from there. For example, there might be a subcomponent that uses persistently uses out of date libraries for some, you know, complicated reason that prevents them from updating. So rather than Claude, you know, automatically trying to update if it needs to overcome some bug, it could listen to the CLAUDE.md and say, well, for this entire code base, we're using, you know, React version x. But for this section here, we're using a a older version of React, and we'll update it later. So this is a combination of the union of different CLAUDE.md instructions, but also the ability to override as you get on the tree and increasingly specialize. And so lot of the questions that I get from, from the semiconductor industry, from, you know, large established, enterprise technology companies really boils down to how do we govern this if Claude Code begins to become successful? And so this is one of the core, kind of the the fundamental, capabilities and best practices in order to map Claude Code to coding the way that you want, to instruct it in the use and maintenance of your repo, and to encode the way that we do it here, or, you know, anything special that somebody should know before doing implementation or, any type of coding in the in the code base, and just encode this throughout the tree. 
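The nested CLAUDE.md idea described above can be sketched in a few lines of Python. This is only an illustration of the ordering, from the most general file at the repo root down to the most specific file next to the code being edited. It is not how Claude Code itself loads or merges these files, and the function name `claude_md_chain` is a hypothetical helper invented for this sketch.

```python
from pathlib import Path


def claude_md_chain(repo_root: Path, target_dir: Path) -> list[Path]:
    """List the CLAUDE.md files that sit between repo_root and target_dir.

    The result is ordered most general first, so a later (deeper) file can
    be read as specializing or overriding an earlier one.
    """
    repo_root = repo_root.resolve()
    target_dir = target_dir.resolve()
    # Raises ValueError if target_dir is not inside repo_root.
    relative = target_dir.relative_to(repo_root)

    chain = []
    current = repo_root
    # Check the root itself, then each directory on the way down.
    for part in ("", *relative.parts):
        if part:
            current = current / part
        candidate = current / "CLAUDE.md"
        if candidate.is_file():
            chain.append(candidate)
    return chain
```

For a monorepo with a root CLAUDE.md and a second one in a subpackage that pins an older React version, calling `claude_md_chain(root, root / "packages" / "legacy_ui")` would return the root file first and the subpackage file second (the directory names here are made up for the example).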
And so, this is sort of the number one best practice for getting governance control, for being able to overlay your policies and steer Claude in the direction that you want and, hopefully, deter it from making the kind of mistakes that you'd have if you go from first principles in a code base, if you have the code but you don't have the tribal knowledge and the instructions and the specialization. And so there are a few best practices that tend to make things a little bit more effective in CLAUDE.md. Again, it's natural language, so anything that expresses the idea should work. But to make it work better, try to keep the file short and keep pushing Claude to keep it terse. And then, wherever possible, separate out things that are fundamental rules. Like, for example, we always use the camel case casing convention, or we always use underscores to denote private members. These are things that you would encode into rules so that Claude absolutely always does them, as opposed to information about the code base or the sort of understanding that you keep in CLAUDE.md. And then finally, if certain parts of the CLAUDE.md stack are not relevant to the current project, you can do just an outright knockout. You can specifically exclude those CLAUDE.md files if you find that you're colliding or fighting in some way. And so this is a relatively lightweight but extremely important best practice overall: just seeding your code base with CLAUDE.md files and migrating things out to rules as needed. This is one of the number one ways that you can steer Claude Code towards working with your code base effectively, and you can gain a level of governance control. If you need things to happen a certain way, if you need things to be triggered from other things, ideally formalize it in code, but Claude rules and the MD stack should be pretty effective.
Mahima, anything that you'd add on this? Anything that I haven't covered on CLAUDE.md?
Yeah. I was actually just about to jump in, so great timing. I think that hardware code bases, especially if you're working in this field, tend to be very, very large. So I think that taking the approach of smaller files, scoped to a specific module, scoped to a specific project, and then making use of the rules paths and organizing those instructions is gonna be really helpful when you're working with very large hardware code bases. So please keep that in mind. Claude Code is able to traverse a gigantic hardware code base if you make use of this nested CLAUDE.md structure.
Thanks. And do you happen to remember the command for mapping out a code base? Is it learn or explore? I couldn't remember off the top of my head. We'll come back around to it.
Yeah. Sorry. I usually use init, and then it kind of starts traversing. But I think it might be explore. I wanna say explore.
We'll find out and get a definitive answer before the time is up here. But now we can move on to a really exciting development that I think has been in the mail to some degree ever since we created Claude Code, but I think is really finally becoming just a terrific first-class feature and something that significantly improves the capabilities of Claude Code, which is to run agent clusters, or agent fleets if you prefer, but essentially to decompose problems and spin up parallel agents. In some cases, parallel agents with different charters, different objectives, and to have them work together, and in some cases maybe even touch the same files in different worktrees.
But, essentially, to break out of the single-thread model of an agent doing work and the user responding, it's now possible to have Claude take on much, much more complex tasks by decomposing them and then standing up a group of parallel agents. So the way this works is that, at the most naive level, you can run parallel Claude by simply having many different Claude windows open. And this tends to be the first rung on the ladder after people have developed comfort with using coding agents. There's a great article, for those that wanna track it down, by Steve Yegge that lays out this kind of eight-point maturity curve for developers. And the first one is that you're using agent install. And then beyond that, the agent starts to take up increasing real estate on your screen. Eventually, you move into a mode of trusting the agent, letting it do a lot of the code implementation, and verifying at a high level. Your coding experience becomes agent first, and then over time, you begin to activate multiple agents simultaneously on different projects, different parts of the code base, and jump between them. And then at the higher levels of maturity in the stack, you're essentially letting the agents rip on very large complex problems, and the agents themselves are spawning agent fleets, and they're orchestrating the sub agents and doing their own parallelization and task decomposition. So if you're just doing it the naive way, just launching a lot of different parallel Claudes, one of the problems that you can get into is that two tasks in the same part of the code base might depend on touching the same file at the same time. And so this would obviously be a huge no-no if you have colliding changes from different instances of Claude that don't know what each other are doing.
That could lead you into some real trouble. And so Git has a great feature called worktrees that we're able to use directly through Claude. And so what we're basically doing is creating isolated versions of the same repo, and then we can run Claude in each of the different sub worktrees. And then, eventually, we can get our conflicts merged and get to the point where we have a single check-in that we can stage. So this is one way to do parallel Claude and get the throughput boost. We'll show another one in the demo in just a moment, which is managing the sub agents directly. And so now that we're talking about sub agents, it's possible to prescriptively define the sub agents in advance and actually build them into your project with definitions and markdown files. But it's also possible at runtime to just tell Claude what kind of agent you wanna create if you feel that you have a strong instinct for what's needed in that particular moment. In general, Claude's pretty good at figuring out what kind of agents it needs and prompting its sub agents to do the right thing. But, again, you can override this very explicitly by building agents. You can override it less explicitly by describing the type of agents you want. Or you can just let Claude do its thing, which honestly is what I tend to prefer to do because it's usually very smart. And so there are a couple of reasons why we do this. One of them is just the basics that you can give the agents specific roles, and they can act in those roles. This is especially important if you want one agent to be a test agent or to work adversarially against another agent in the system and to basically make the system better by trying to break it in some way or trying to evaluate or red-team it in some way.
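As a rough sketch of the worktree approach mentioned above, the following Python builds one `git worktree add` command per task so that each parallel session gets its own checkout and branch. The `agent/<task>` branch naming and the helper names are assumptions made for this example, not a convention from the talk or from Claude Code.

```python
import subprocess
from pathlib import Path


def worktree_add_command(worktree_root: Path, task: str) -> list[str]:
    """Build the git command that creates an isolated checkout for one task."""
    branch = f"agent/{task}"      # hypothetical branch naming scheme
    path = worktree_root / task   # one directory per parallel session
    # `git worktree add -b <branch> <path>` creates the branch and checks it
    # out in a new working directory that shares the same repository.
    return ["git", "worktree", "add", "-b", branch, str(path)]


def create_worktrees(repo: Path, worktree_root: Path, tasks: list[str]) -> None:
    """Create one worktree per task; raises CalledProcessError if git fails."""
    for task in tasks:
        subprocess.run(worktree_add_command(worktree_root, task), cwd=repo, check=True)
```

Each session can then be started inside its own directory, and the branches are merged back once the work is done, which is where any conflicts get resolved.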
The second is that this is a great way to kind of get the get the work sort of encapsulated in the sub agent so that you don't have to be constantly monitoring it, and you can have just kind of a long running task that it can go off and and run on. And then the third one is is kind of one of the basics of working with coding agents, which is your context window. This is the the number of tokens that you can have in the conversation before we have to do a compaction step or somehow, you know, reboot or recycle the conversation. That is that is kind of your the the sort of scarce life force of working with coding agents. And so it's good to be very disciplined with how much you introduce into that main agent's context window. And one of the great things about working with sub agents is that you can push lots of less valuable context down into the sub agent's context window, and then they can kind of bubble up to the top agent with a much lower a much lower impact set of tokens that describe, you know, the results of the subagent, the next steps the the orchestrator agent should take, and just in general, be extremely efficient in managing the context of the all important orchestrator context window. So one of the the obvious use cases for this is anything that is an embarrassingly parallel decomposition. So, you know, spawning sub agents to go look in different parts of a repo and, you know, build documentation, doing, you know, a framework migration across a big chunk of a code base where repos are essentially isolated from each other, but, you know, maybe share component libraries. And, you know, kind of anything where the task naturally decomposes in a way where the agents would not be expected to be touching the same file at the same time and, you know, meaningfully changing it, in in a way they'd step on each other's toes. And, finally, it's possible to manage teams directly through Claude Code, and this is what we'll see in the demo in just a moment. 
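The context-window argument above can be illustrated with a toy model. In this sketch, each "sub agent" is just a Python function that reads a large amount of text and hands back a one-line summary, so the orchestrator only ever holds the summaries. Word counts stand in for tokens, and `run_subagent` and `orchestrate` are invented names; no real model or Claude Code API is involved.

```python
from concurrent.futures import ThreadPoolExecutor


def run_subagent(name: str, documents: list[str]) -> dict:
    """Stand-in for a sub agent: consumes a lot of context, returns a little."""
    words_read = sum(len(doc.split()) for doc in documents)
    summary = f"{name}: reviewed {len(documents)} files"
    return {"name": name, "words_read": words_read, "summary": summary}


def orchestrate(workloads: dict[str, list[str]]) -> list[dict]:
    """Run every sub agent in parallel and collect only their short results."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(run_subagent, name, docs) for name, docs in workloads.items()]
        return [future.result() for future in futures]


def orchestrator_context_words(results: list[dict]) -> int:
    """Words that actually land in the orchestrator's context."""
    return sum(len(result["summary"].split()) for result in results)
```

The point of the design is the asymmetry: the bulk of the reading stays in the sub agents' own contexts, and the orchestrator's budget is spent only on results and next steps.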
You can stand up sub agents. They can communicate with each other, and they can communicate through the orchestrator, and they'll kind of come up for air. They'll they'll check-in with the orchestrator when they finish a task, when they need user input, when they're surprised in some way, when something fails. But, you know, this is again, as I mentioned before, this is something that really shines when you have a problem that can be broken down into nonoverlapping work streams. So if you have overlapping work streams, use worktrees. If they're not overlapping, if it's just a natural problem decomposition, just stand up an agent team and let Claude, kind of go and do its thing. So that's sort of the rundown. Parallel Claude's are great for unrelated tasks, each one in a work tree or a terminal. Sub agents are really good for kind of, you know, pure delegation. Like, you wanna have a test agent go off and test a big part of the code base. And then agent teams are for practically decomposable large problems with kind of multiple sub threads that don't require direct interaction with each other. And so with that, I think we may be in a position to do a little bit of demo time. Alright. I'm gonna stop the share here, and we'll go over to sharing my screen. And, let's live dangerously. Alright. And speaking of living dangerously, we're gonna share the whole screen here. Alright. Mahima, is this coming through okay? Can you see the screen? Yes. Excellent. Alright. So, what we're gonna try to do here is, build a version of the Simon memory game in React. And the neat thing about this project is that it has a natural decomposition for parallel parallel implementations. So at the highest level, there's a React component, as you can see here in the prompt, that manages the game, manages the human interface layer, visualizes the game state, accepts feedback from the user, and, generally, it should have a pretty loose coupling with the back end of the system. 
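The rule of thumb above (overlapping work streams go in worktrees, non-overlapping ones go to an agent team) comes down to asking whether any two tasks expect to touch the same file. A minimal sketch of that check follows, assuming you can list the files each task is likely to edit. The file names in the usage note are hypothetical and are not taken from the demo repo.

```python
from itertools import combinations


def overlapping_tasks(tasks: dict[str, set[str]]) -> list[tuple[str, str, set[str]]]:
    """Return every pair of tasks that plans to edit at least one common file."""
    conflicts = []
    for (name_a, files_a), (name_b, files_b) in combinations(tasks.items(), 2):
        shared = files_a & files_b
        if shared:
            conflicts.append((name_a, name_b, shared))
    return conflicts


def suggest_strategy(tasks: dict[str, set[str]]) -> str:
    """Apply the rule of thumb: any overlap means isolate with worktrees."""
    return "worktrees" if overlapping_tasks(tasks) else "agent team"
```

With three tasks that each own a separate file, say `{"engine": {"src/engine.ts"}, "audio": {"src/audio.ts"}, "ui": {"src/Simon.tsx"}}`, `suggest_strategy` returns `"agent team"`. Adding a fourth task that edits both the engine and audio files would flip it to `"worktrees"`.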
This is a memory game that uses audio in addition to video, and so we need an audio layer, and we need to be able to trigger the right sounds, and all of that implementation should be local to the the audio component. And then finally, we need something that actually runs the game. And the game is independent of the audio renderer and the, you know, kind of the the view visualization layer. And so that's a self contained component that we can build and test on its own. And so what I've done is I've asked Claude to build for itself a plan to do a parallelized implementation of the entire game. And, we've already built this once, so, you know, in the worst case, we can, go back and see the end product. But I have a, a clone of this repo back from the point where I had only built the plan. I hadn't implemented anything, and it's ready for Claude to implement. So I asked Claude, can you summarize the plan and just get ready to go? And so Claude, one of the the exciting things about, especially working with Claude Opus four five, sorry, four six is it just really, really, really wants to code. And so when you, let it off the chain, it goes and and, does some exciting stuff. So all we have to say now is go, and Claude is off and running. So the first thing that it's gonna do is it's gonna build, the kind of outer layer, the the sort of scaffolding of this app, and then it's going to define the the subtasks that it's gonna going to deliver to the parallel agents to go and implement. And so, we'll let it do its thing, sort of thinking about what it's gonna do. And now and this is critical. It says, I'm launching three parallel agents to build the engine, the audio, and the UI components just as we've instructed here. And in a moment, we're gonna see a pretty cool new feature of the Claude Code UI, which is that you can directly manage the parallelization of these agents by navigating through the UI and, if necessary, deep diving into what each of the agents are doing. 
But for now, we're gonna let Claude Code continue and, hopefully, show us that it's gonna be orchestrating these parallel agents. And while we're waiting for this to stand up, does anybody have any questions that we can answer as we're getting the demo script going here?
There was a question regarding, Mahima, part of your opener. Do you mind maybe repeating the slash command that you use at the beginning of each Claude Code session while you explore all the CLAUDE.md files?
Yeah. Of course. That was /init. So when you run /init, that's kind of the cursory overview of a code base. Claude will step through it agentically and then build a preliminary CLAUDE.md file from scratch. And then if you wanted to go into it further, Discover is going to explore Claude Code features that you're going to be using in order to track your progress. And then I'm still looking for what the explore slash learn command is, but init is how you would initialize Claude Code in your code base, and it would go search and do the initial learning.
Great. We've got one more here. The question is: we have a lot of documentation for our modules, but it's being kept on an internal server that we have access to. It's accessible in HTML. Is there a way to link specific documents in the CLAUDE.md files for Claude to look at when working with specific parts of the code base?
This would be a great use case for an MCP server that you could actually use to connect to your systems. And while we're waiting for Claude, would you be willing to recap, Mahima, what an MCP server is and how you get one?
Yeah. Absolutely. So MCP servers are essentially very similar to other protocols you may be familiar with, such as REST or LSP. They're essentially plug-in points for the model, so Claude, to be able to interact with different components that have been exposed from a third party.
So this third party can be a third-party service such as Google Drive or Salesforce or Databricks, something like that, where those third-party vendors are exposing different tools or different components for Claude to use, or it can be something that you develop yourself. So in this example, you want a connection point into an external place where your docs are being stored. You can develop an MCP server that will be that connection point to access your data. MCP servers are made up of three basic primitives, and there's a lot of information online about how to build them. And Claude Code itself is actually very good at building those MCP servers. Those three basic primitives are model based, application based, and user based. And when you build around those three primitives, tools, resources, and prompts, you will be able to essentially provide the model three different ways to interact with either model-specific information, application-specific information, or user-specific information, and you can open that connection point into your internal systems.
Alright. Thank you very much. So I think what we're gonna do here for the demo is go back over to the run where Claude was able to complete it, and, hopefully, we'll have something more exciting to check out over here once we start getting tokens from the model. So let's walk through what Claude did. The first thing it did was it planned everything out. And for those that haven't used it yet, slash plan, or plan mode through your favorite IDE, is one of the best ways to get more out of Claude. It makes it really, really strong at comprehensively building an implementation plan and really thinking through it and, if necessary, second-guessing itself and refining the plan. And then it presents you something that, as the user, you can just choose to let it loose on.
And so in this case, it planned out the implementation of the game engine, audio, and UI components. It did a degree of prework. It mapped out a parallel implementation plan, and then I asked it to check it in so that we could do another instance of this. And so then finally, I gave it the same instruction. Go. Go, Claude, and build this. And so we walked through overcoming a couple of errors along the way, which Claude Code is great at doing. And then it started emitting code, lots and lots of code. And so as we move here, I gave it the instruction that I didn't need it to run NPM dev for me. It continued on. It worked through implementation. And then interestingly, not only did it create these sub agents, but it also went further based on my prompts. So back here at the beginning, I said, once we build these, please test each of them. The top-level agent could do the integration and go ahead and build the plan. And so now if we go back down to where we're picking up here, the adversarial agent was launched once the implementation completed, and it continued to run after Claude returned control to me. So in the background, there was this adversarial agent running. It was actually implementing and executing tests. It found a couple of weird esoteric race conditions. But once we'd implemented the base software, I said, hey, please launch a background agent to assess componentization. And so, one, this is using a parallel agent. And two, it's giving a specific directive to that agent to inhabit a persona, which is a software engineering agent that assesses code quality and encapsulation. And so, as it says here, two agents are now running in parallel, and so they'll report back, but we've gotten control back in the main Claude interface. So, over time, it managed to go through everything. It verified the program. It verified that all tests passed, which is pretty important.
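Returning to the earlier question about module documentation kept as HTML on an internal server, here is a minimal sketch of what such an MCP server could look like in Python. It assumes the official MCP Python SDK (`mcp` package) and its `FastMCP` helper are installed. The base URL, the `<module>.html` path layout, and the tool name are all hypothetical placeholders for this example.

```python
from html.parser import HTMLParser
from urllib.request import urlopen

DOCS_BASE_URL = "http://docs.internal.example"  # hypothetical internal server


class _TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    def __init__(self) -> None:
        super().__init__()
        self._skip_depth = 0
        self.chunks: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def html_to_text(html: str) -> str:
    """Reduce an HTML page to plain text so it costs fewer tokens."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


def fetch_module_doc(module: str) -> str:
    """Return the plain-text documentation page for one hardware module."""
    # Assumes pages live at <base>/<module>.html; adjust to the real layout.
    with urlopen(f"{DOCS_BASE_URL}/{module}.html", timeout=10) as response:
        return html_to_text(response.read().decode("utf-8", errors="replace"))


try:
    from mcp.server.fastmcp import FastMCP
except ImportError:  # SDK not installed; the helpers above still work alone
    FastMCP = None

if FastMCP is not None:
    server = FastMCP("internal-docs")
    server.tool()(fetch_module_doc)  # expose the function as an MCP tool
    # To serve over stdio, call server.run() from your entry point.
```

A CLAUDE.md file for a given module could then tell Claude to call this tool for that module's documentation, instead of pasting the documents into the repo.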
And so, in in essence, what we did here is a test driven and spec driven development approach to building this application. We defined what we wanted. We defined the, in effect, the test conditions we needed. The the agents built out those test conditions and the verification cases, and we were able to implement by sort of filling in the bodies of everything instead of just sort of building it on a blank canvas. And so let's check-in on our other implementation over here, and now, we can kick these off. And so we can see the agents beginning to start themselves up here. The workflow agents, one through three, they're in the process of kicking off, and then we'll see that new UI. So if I scroll down here, we've got one background workflow. And, background workflow right now is just starting the agents. But in a moment, those will stand up, and we'll be able to dig into them specifically. Alright. And while Claude is doing its thing there, let's go take a look at the end product that we created in the first run. And, not surprisingly, here we have, what looks to be an excellent implementation of the Simon game. And, just for fun, let's play it here. So there's our audio engine. Hopefully, that's coming through in the webcast. And let's make a mistake. There we are. Alright. That's a that's a suitable sound for making a mistake on Simon. I think this this, maps pretty well to my recollection of playing this game back when that was the kind of electronics we had. And finally, let's check-in on Claude here and see see how we're doing with kicking off these work streams. So now that it's been able to start these agents, there should be a lot of inputs coming in quickly because once you're in whoops. Alright. We're gonna skip that security check, and, hopefully, it can learn that it is not able to use, ESBuild. Let's just tell it so. Alright. So we've backed into interactive mode. Go away. And we'll just let it know that it can't use that. 
But in any case, I think the the general idea is coming through, which is you can create parallel agents by just prompting Claude. You can manage the parallel agents directly in the UI. The parallel agents improve the throughput of building complex systems in the code. And then as you stitch it all together, you can ask a test agent to both test the code and make sure that it functionally works and ask another agent to assess the overall factoring and code quality. So you can see there that we're touching just about every part of the overall software engineering workflow. Now this is a very, artificial test, but I think it's a great example of showing functional decomposition and how it maps parallel agents. And with that, I think we can go back over to questions. I'll stop sharing my screen. Okay. There was a quest from the audience. The quest the question was, regarding semiconductor design activities. Most of what we've shown here has been specific to software development, writing RTL code. The question is, what about connecting coding agents to existing ADA tools? Mahima, what do think about that one? Yeah. Sorry. I'm also typing the response to another question as well about RTL from spec in that workflow. So I'll send that in a sec as soon as I'm done answering, here. But, yes, it's actually a really common use case to either use Claude Code to analyze the results from different EDA tools or to connect agents to different e d l EDA tools to to analyze those results. That is something that is commonly used, and Claude is very much able to do this. In terms of how best to do that, it kind of depends on what type of results you're trying to analyze. Sometimes building with the sometimes just attaching clog just instantiating clog code and looking through the results is great. Sometimes we you need to personalize it a little bit. 
So what I've seen some people do is they'll point Claude code at the agent SDK, and they'll build a little agent that is specialized to maybe a specific tooling or result from one of the EDA tools. It it depends on reusability as well. If you think this is something you're going to keep wanting to run and reuse, then I guess it makes sense to invest that time into automating that process. But I think a great way to start is with just Claude Code initially. Excellent. So another question. Our our repo is a mix of RTL, UVM test benches, Python scripts. How do you keep Claude from getting confused about which conventions apply where? That's, again, going to be done a lot through how you define your CLAUDE.md files and the structure. Structure. In terms of the conventions that it's getting confused with, hopefully, it's not getting too confused. It should be able to understand RTL versus Python versus, versus what's in your test benches. I guess maybe it's just losing some of the context on, like, different different naming conventions or hierarchies that you have. So, again, that's where really well thought out CLAUDE.md files are going to be very important. Excellent. Another one, we we've had AI generated system Verilog that looks syntactically clean but failed synthesis or doesn't meet timing. Where is Claude actually on that today? Verification. needs 60 or 70% of our design cycle. Where can Claude help us move the needle? Yeah. Absolutely. I think that that has been kind of flagged as an area of improvement for Claude, but it should be it's again, it's 60 to 70% kind of benchmark right now. However, Claude is getting better at it every day. The best way for me to put that. Alright. How would you build the MD file to instruct on how to organize those dot MD files and use them? Yeah. There's a couple different ways that you can do this. 
What I've seen in the past is sometimes people use the MD file at the top of the hierarchy as almost a tabular CLAUDE.md file, and then that tabular format is able to keep track of all of the CLAUDE.md files in the code base. Eric, I don't know if you have any other suggestions for what people could do there.

I mean, those are the basic best practices. Just don't be too verbose. Definitely don't duplicate things unless you wanna just really reinforce it for Claude, but that's probably less effective than rules. And, you know, I personally update CLAUDE.md whenever I make a change to a repo. So these are things that you should feel empowered as the owner of a repo to rev often.

Great. And then, just speaking more specifically about successful use cases for Claude in semiconductor flows, Mahima, would you say verification scripting, flow automation, documentation, something else when you think about strongest use cases today?

I think that RTL generation and test bench generation use cases are the strongest. We also have seen analysis of results from EDA tooling actually being very strong. However, sometimes it's testing it with Claude Code and then actually building an agent that's a little bit more specific there. Also, we believe that in the future Claude is going to be much better at RTL generation that passes test benches and passes the verification stage. So definitely building for that future is very important.

Excellent. A more general question. How are sub-agents different from a skill?

Oh, do you wanna take that, Mahima, or should I?

Oh, feel free.

It's a great question. There's a really fundamental difference. So an agent has agency. It does things. Kind of the simplest definition of an agent is that it's a large language model that can use tools so it can change the state of the world in some way.
It's running in a loop, and it has a goal. And so that definitely applies to the sub-agents that we created in the coding demo, where we gave each of them an objective, which was to implement and test their own component. And so the best way to think of a skill is it's the definition of how to correctly perform a task. I like to say "the way we do it here." So according to the rules of the organization, according to SOPs. And the skills should actually be relatively static. And so it's a separation of concerns thing, where the frameworks that execute the skills, the harnesses that help the agents work in parallel and coordinate with each other, those are getting increasingly complex, and they're changing frequently. A skill that represents the way to interpret a document or the way to, you know, convert a template from one workflow phase to another one, these are things that are external to the system that does parallelization and manages the agents. And so you can think of a skill as an encapsulation or, like, a formalization of the correct way to do something given a set of tools to get a predictable outcome. You can think of the agent as the vehicle that executes that skill to deliver the outcome.

Fantastic. Can we build code specific to a target hardware platform?

Yeah. Absolutely. I mean, I would say that lives somewhere between CLAUDE.md rules and skills. But if you, the developer, have strong knowledge of this platform, and there's a way to ideally give Claude Code examples of using the platform in the correct way, then there's no reason that you couldn't basically use all of that existing code to equip Claude to write novel implementations based on the patterns that work already. And, Mahima, anything you'd add on that one?

Nothing to add there.
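To make the agent-versus-skill distinction from the answer above a little more concrete, here is a minimal Python sketch. It is only an illustration and not Claude Code's actual skill or sub-agent format: the `Skill` and `Agent` classes, the `convert_template` skill, and the toy `strip` and `upper` tools are all hypothetical names invented for this example. The skill is static data describing how a task is done; the agent is the thing with a goal and tools that runs a loop to execute it.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Skill:
    """Static description of 'the way we do it here' (hypothetical format)."""
    name: str
    steps: Tuple[str, ...]  # ordered tool names; expected to change rarely


@dataclass
class Agent:
    """The vehicle: holds a goal and tools, and runs a loop (hypothetical)."""
    goal: str
    tools: Dict[str, Callable[[str], str]]

    def run(self, skill: Skill, state: str) -> Tuple[str, List[str]]:
        # A real agent would have a model deciding the next action;
        # this toy loop simply follows the steps the skill prescribes.
        trace: List[str] = []
        for step in skill.steps:
            state = self.tools[step](state)  # each tool call changes the state
            trace.append(f"{step} -> {state!r}")
        return state, trace


# Toy tools standing in for real ones (assumptions for the example).
tools = {"strip": str.strip, "upper": str.upper}
convert_template = Skill(name="convert_template", steps=("strip", "upper"))
agent = Agent(goal="normalize a template header", tools=tools)

result, trace = agent.run(convert_template, "  module top  ")
print(result)
for line in trace:
    print(line)
```

The point of the split is that `convert_template` could stay unchanged while the `Agent` class, or whatever harness runs it, is replaced or made parallel.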
I was actually kind of responding to another question about how to create knowledge bases for specifics around EDA tooling. So I can actually answer that out loud if you'd like. Yeah. So, totally understood. Vendor tool documentation changes with versions, and CLAUDE.md alone doesn't always stick. What you can actually do is build an EDA knowledge base for any kind of tooling that you have. So in your dot file, you will have agents that are described as, like, a Cadence Genus agent, for example, or a synthesis agent or a simulation agent. You can define different things for runtime knobs and tool-specific settings. And then you'll be able to create kind of, like, tool-specific files in each of those agents. And when I say tool, I mean EDA tools: EDA-tool-specific agent files in each of those areas. And then you can define things like configuration points, compilation information, runtime environment constraints, or other information that Claude needs to have: when Claude should be using the agent, consolidated references, anything like that. And then you will be able to basically create a little knowledge base for Claude to use any of those tools.

Excellent. This is a great question. Can Claude assist in selecting circuit topologies? For example, OTA architectures, bandgap references, LDO designs based on spec trade-offs, or is it limited to well-documented circuits in the design spec?

No. It can make those trade-offs if you provide the right context, and I've seen people use it to make those trade-offs.

Great. How well is the Claude model trained on SystemC design, if at all? Is that something that you're aware of, Mahima?

Claude has been trained on a variety of different coding languages all across the Internet and across all the different data that we provided.
So it is good at a variety of different coding languages: embedded C, C++, anything in that vein as well.

I would just add one thing. You know, the common expression is "stochastic parrot." If there were ever a stochastic parrot phase for these models, I think we've long since exceeded it, and we've proven this through our interpretability research. The way these models understand code is that they basically convert it as they read it into a conceptual understanding, and that language-agnostic conceptual understanding can then be implemented in any target language. And so it's less about understanding a particular language at the language semantics level and more about understanding how computers work and how a particular technology domain works and being able to generalize that across any language. So in theory, you could give Claude a completely made-up language, as long as it was internally consistent and had code-based examples, and it could emit that language to express the underlying programming concepts that it internalizes when it's reviewing a piece of code.

Fantastic. Any general tips for managing the context window and avoiding surprises there?

I mean, I'd say just keep CLAUDE.md thin. Push everything you can into your sub-agents when appropriate. And this is just a personal experience tip, but preemptively compact when you hit a decent starting point. So, you know, save frequent checkpoints. Do lots of staged micro commits and squash them later. And if the UI says that you're getting close to the end of your context window and you're in a good place where you can kind of take a break, take a snapshot, and compact and create a new context window to work with, it's better to be in control of that than to have compaction happen midway through an operation.

Yeah.
And I can answer, there's a question that I see about the most successful use cases that we've seen for Claude in semiconductor flows. Claude is very good at flow orchestration scripts, so it can write complex multi-tool instructions that coordinate across a lot of different vendor tools. So it should be able to handle tool-specific output parsing, especially if you give it a lot of context on what it's supposed to know. It's very good at, like, log parsing and root cause analysis in general as well. So if you feed it a bunch of bug information, software or hardware specific, it is very helpful in root cause analysis there. It is actually pretty decent at generating RTL code. I think the compilation rate right now is a little bit lower than we'd like, but in the future, we 100% believe that it will be better as the model gets smarter. And then what it does have a little bit of an issue with right now that I've seen across different customers is PPA tasks, so performance, power, area optimization. But, again, over time, all types of models will get smarter and better at these types of tasks.

I just wanna emphatically second that, because in my own experience with it, its intuition for taking a lot of raw log data and pinpointing some tricky race condition or some, you know, very, very complex edge case that's hard to build a repro for, these are things where I think AI is just purely net additive to human intelligence. There may be things that we can intuit and find that, you know, it won't work out on its own. But in practice, one of the most effective habits is just giving it a huge amount of raw stuff to slog through and saying, what's going on here? And a shocking amount of the time, it will just zero in on the right problem and, you know, give you a very fast path to a fix.
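The habit described above, handing over a large pile of raw logs and asking what is going on, could be supported by a small helper that bundles log files into one block of text. The following Python sketch is only an illustration: `build_triage_prompt`, the per-file character budget, and the keep-the-tail strategy are assumptions made for this example and are not part of Claude Code. The resulting text would be pasted or piped into whatever Claude interface you already use. The log content in the demo is made up.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def build_triage_prompt(log_paths: Iterable[Path], question: str,
                        max_chars_per_file: int = 20_000) -> str:
    """Concatenate raw logs under labeled headers, followed by the question.

    Only the tail of each file is kept (an assumption: failures tend to
    show up near the end), so one huge log cannot crowd out the others.
    """
    sections = []
    for path in log_paths:
        text = path.read_text(errors="replace")
        tail = text[-max_chars_per_file:]
        sections.append(f"=== {path.name} ===\n{tail}")
    return "\n\n".join(sections) + f"\n\nQuestion: {question}\n"


# Demo with a made-up simulation log written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "sim.log"
    log.write_text("UVM_INFO start\nUVM_ERROR scoreboard mismatch at 120ns\n")
    prompt = build_triage_prompt([log], "What is going on here?")

print(prompt)
```

Labeling each section with its file name keeps the input raw while still letting the answer point back to a specific log.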
And even with older models, I've had cases where I was trying to do, you know, something at kind of an embedded system level, and I gave it just a binary dump of what I was able to read off of a pin. And so I said, I've been listening to this pin at 12 megahertz serial at this voltage for the last five minutes. Claude, what is this? And it's been able to say things like, oh, this is Dolby AC-3, and you can see the headers right here. Or, you know, this is a WS2812B LED control protocol. Like, you know, obviously, do you want me to speak it? And so that sense of things that you would never attempt as a human, like, you know, manual review of a binary dump, these are things that, again, with the kind of intuition and complexity of the model, it's unexpectedly good at.

And I wanna double click on that binary dump use case specifically because that's what I've seen work so well for so many people.

Yeah. Just give it the raw bits and let it figure it out.

Yeah.

Alright. I think we've got a good one to maybe wrap up on here. And, Eric, take your time with this one. It's a little bit of a forward-looking question. So the question is: semiconductor CAD flows often require highly complex orchestration across massive compute grids. With the evolution of tools like the Claude Agent SDK, how do you envision LLMs moving beyond code assist to autonomously manage and debug compute tasks or CI failures within highly proprietary environments?

What a closer. Swinging for the fences. You know, just based on the question, I think this is definitely coming. The way that we tend to approach this stuff at Anthropic is to think of it in terms of optimizing the dev loop so that the agent can do things without human interaction. And so as you start to work towards automating things, making them run asynchronously, and then scaling up the execution of those things, the killer is not the ability to do something.
It's needing a human to give input. And so step one is to kind of constrain the problem space until you don't need a human in the critical path if you're trying to do this at scale and, you know, maybe let something hill climb on trying to work its way through solving a problem. So first, make sure the human doesn't have to be in the loop and you've chosen a problem that has that property. And second, optimize the heck out of that dev loop. So the product here is not the model and exactly how you're coupling it with the use case. It's giving the model the ability to iterate very quickly and efficiently and to do it in a way that it can run on a, you know, totally open-ended basis. There's a really great article that we published on our research blog, which, I don't know the exact title, but it essentially boiled down to having Claude implement a clean-room clone of GCC with only compiled binaries and the original source code that led to those binaries as a way to work its way towards solving the problem. And so we put it in a very simple loop. We just basically said, Claude, pick up the prompt in this file, which you can modify if you need to, and just keep working. When you run out of stuff to do, terminate, and then pick up the prompt and loop again. So in this incredibly dumb, straightforward, simple loop, we were able to let Claude run for weeks and spend $20,000 worth of tokens and then replicate one of the most complex pieces of software on Earth, in a way that, if it doesn't work perfectly, it is worthless. And so I think that's a great proxy for the kind of problems that you're likely to encounter as you build these sort of agent fleets. But the magic happens when you create that dev loop that can...

Excellent. Thank you, Eric. Thank you, Mahima, and thank you to our audience today for coming on. We truly appreciate your time.
As mentioned earlier, there will be a webinar recording sent out within the next twenty-four hours for those of you who may have missed part of the presentation today, and we continue to appreciate your support of Claude and of Anthropic. Thank you so much.

Thanks, everybody.

Thanks, everyone.
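For readers who want to picture the "incredibly dumb, straightforward, simple loop" described in the final answer, here is a minimal Python sketch. It is not the harness used in the work that was mentioned: `run_prompt_loop`, the `PROMPT.md` file name, the `max_iterations` cap, and the rule that deleting the prompt file ends the loop are all assumptions added for illustration. The agent is passed in as a plain callable so the sketch does not assume any particular SDK or CLI.

```python
import tempfile
from pathlib import Path
from typing import Callable, List


def run_prompt_loop(prompt_file: Path,
                    run_agent_once: Callable[[str], None],
                    max_iterations: int) -> int:
    """Re-read the prompt file and start a fresh agent pass each iteration.

    The agent may edit the prompt file itself, so it is read again every
    time. The loop ends when the file is gone or the budget is used up
    (both stop conditions are assumptions of this sketch).
    """
    iterations = 0
    while iterations < max_iterations and prompt_file.exists():
        prompt = prompt_file.read_text()
        run_agent_once(prompt)  # one pass: work until nothing is left to do
        iterations += 1
    return iterations


# Demo with a fake agent that edits the prompt twice, then signals "done".
with tempfile.TemporaryDirectory() as tmp:
    prompt_path = Path(tmp) / "PROMPT.md"
    prompt_path.write_text("Keep working through the task list.\n")
    seen: List[str] = []

    def fake_agent(prompt: str) -> None:
        seen.append(prompt)
        if len(seen) == 3:
            prompt_path.unlink()  # simulate finishing all the work
        else:
            prompt_path.write_text(prompt + f"note from pass {len(seen)}\n")

    passes = run_prompt_loop(prompt_path, fake_agent, max_iterations=10)

print(passes)
```

Keeping the loop this simple leaves all of the judgment inside each agent pass, and the iteration cap gives an unattended run a hard upper bound.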