13:06:23 to talk about the structure-function problem in microbial communities, and to say thanks for coming. I'm glad we got to do this in person; it's amazing to be at a conference again. 13:08:21 Good. 13:08:23 Okay. Good. So let's start at the beginning. Let's take a step back and start with the question of what we're doing here. 13:08:34 So I think one of the striking features of the long history of the biosphere is the manifestation of evolutionary dynamics over the last several billion years, as shown here by measurements or inferences of the partial pressure 13:08:47 of oxygen in the atmosphere. 13:08:50 So the point is simply that at some point in time back here photosynthesis was invented. That metabolic process gave rise to oxygen in the atmosphere, which creates niches for things like respiration to evolve, and eventually higher 13:09:03 plants. The point is not the detailed wiggles of these curves, which are of course interesting in their own right. 13:09:11 The point is just that evolution begat ecology, ecology begat evolution, and so on. 13:09:21 I'm not saying we understand anything. I'm simply making the statement that this is the process under study, and the outcome of this process, I think it's unassailable, is a set of stable nutrient cycles on a global scale. 13:09:37 So here are my two favorite examples. One is the nitrogen cycle, which we'll talk about some today, and the other is the carbon cycle. I think it's worth just thinking for a moment about the fact that on this evolutionary timescale the system 13:09:48 self-organized to continuously regenerate these nutrients on a global scale. To me this is quite a striking feature of the biosphere. 13:10:05 Quicker, quicker. 13:10:06 Okay. So who's responsible for these fluxes of metabolites through the biosphere? In large part it's microbes.
So microbes are responsible for roughly half of all the carbon fixation on the planet, through the process of photosynthesis. 13:10:21 And this process of denitrification, which I'll talk about today, is performed in very large part by bacteria in soils. 13:10:28 And so I would just like to argue that over this evolutionary process it's microbes resident in communities that are responsible for driving these metabolite fluxes through these sets of redox reactions. (I'm getting too far from the receiver.) 13:10:44 OK, so the redox reactions and the metabolites that constitute those reactions are the observable output of these communities. 13:10:52 And we've looked at this example of the microbial mat that they have a key and others of course have worked on. And if you look at the mat, it instantiates many of these transformations, from nitrate to ammonia and so on. 13:11:06 But when you start digging into what the mat looks like, it's a mess, right? It's a very complicated system. There are these layered structures which seem regular, but it's not simple. 13:11:16 And if you look down another layer of organization, it's truly complicated, as Daniel evacuee filled in this paper. 13:11:24 So at this very high level of organization we have this flux of metabolites, and when you look at it, it looks like there's relatively simple stuff going on: there's redox driving this cycle of nutrients. 13:11:37 But when you look microscopically at who's responsible for those cycles of nutrients, the structure of those communities is kind of intense and complicated. 13:11:48 So for me, I think a lot of clarity came from articulating the question to myself with my collaborators.
13:11:55 What is the relationship between this level of organization, where the system looks relatively complicated, and this other level of organization, where I might argue it's a little bit simpler, although Terry and Diane showed us examples where 13:12:07 the electron donors and acceptors were anything. 13:12:11 So just bear with me, indulge me. 13:12:15 So basically the question I'm driving towards is: what is the relationship between these two levels of organization, the genomic level of organization, where evolution is of course happening, 13:12:27 and this level of organization where the metabolites are flowing through? 13:12:32 There's lots of work; I'm just going to summarize a few papers that I think were really influential in our thinking about this. One is the work of Stilianos Louca and Michael Doebeli and their collaborators, where they've basically 13:12:46 shown that if you look at replicate instantiations of similar communities in the natural environment, in this case in a pitcher plant, 13:12:55 what you see is that they have roughly the same number of each metabolic guild of bacteria. They've spent a lot of time trying to classify those; 13:13:09 it's a debatable and complex process. But then if you look at who's present in each one of those guilds, it's highly variable. So there are nitrogen respirers, but the guys doing nitrogen respiration in this pitcher plant are completely 13:13:19 different from the ones doing nitrogen respiration in that one. 13:13:22 Okay, so this is sometimes called functional redundancy.
13:13:25 They've also shown that these functional guilds are well predicted by the abiotic parameters in these large data sets, in this case the Tara Oceans data set, where they sequenced all these communities in the oceans, and then they also measure 13:13:38 The point is simply that the metabolic properties of who's there are relatively conserved quantities in natural environments, it would appear. 13:13:56 And then you go to the experiments that many of the people in this room are responsible for, and you start to see what looks, at least very coarsely, like an analogous picture. 13:14:06 So here are Otto's experiments showing the colonization of these chitin particles, and the idea is simply that you get this reproducible succession of bacteria taking over these particles over a few days. 13:14:18 Okay. And Joshua's experiment, which Terry discussed, and then some of my old work with Stan Leibler and Zak Frentz, looking at the abundance dynamics in a synthetic community. The punch line is simply: you take the community, you put it in a fixed 13:14:38 environment, you control that environment, and the abundance dynamics look relatively reproducible. What is the relationship between this and the metabolic activity of these communities in the wild? I'll say it's not really all that clear. 13:14:46 But this, over the last several years in particular, working with Madhav Mani and Karna Gowda and Kaumudi and other people that you'll meet, has led me to this question, which is: what is the mapping between the genetic structure of a community 13:15:02 and its metabolic activity? Is there some way to understand this relationship? Can we make this question sharp? And then ultimately I think the question that as a community we should try to answer is: why did evolution give rise to the 13:15:16 structure of the communities that we see? 13:15:20 So I feel that this question.
I think once I got to it with Karna and with Madhav and with Kaumudi, 13:15:27 I felt the clouds parting a little bit. I think in some sense the question in ecology is often not quite clear, and so I feel that at least this question is sharp enough for me to go do an experiment. 13:15:38 It may not be the best question in the world, but it's given us some clarity. Yes? 13:15:48 Do you mean evolution in the sense of temporal change without genetic change, or do you mean evolution in the sense of genetic change as species learn to deal with each other? The latter, I mean the latter. 13:16:02 Okay. 13:16:06 So I'm going to present results addressing this question in two systems: one in denitrification, and a second in materially closed microbial communities similar to those that were talked about at the last KITP. 13:16:23 And these are attacking this problem at two levels of organization. 13:16:29 One is a small-N, bottom-up approach, where we're trying to understand this flux of metabolites from the composition of the community in the lab, and the other is a large-N, in quotes, top-down approach, where we actually don't understand a whole lot 13:16:48 in detail about what's going on in the community, but we're going to let it self-organize. Okay, let me just say a few words about the people involved. 13:16:52 In the closed communities, there's been a close collaboration between Luis Miguel and Kaumudi, who's sitting right here, and they're co-authors on a paper that's on bioRxiv now. 13:17:03 And this denitrification project really owes its origins to KITP. I was cold-called by Madhav Mani, who was trained by Boris Shraiman as a developmental biologist, sometime in 2015 or 2016. 13:17:16 And he said, I've got this amazing graduate student who is looking for a postdoc, who's never done an experiment in his life. And he's sitting right there; that's Karna.
13:17:24 And we started working together, and Madhav and Karna and I had long discussions, especially over recordings of Otto's talks. 13:17:39 We watched Otto's talks, and we started to think about this question of relating structure to function. And then eventually we realized that denitrification was a good system. 13:17:46 I want to echo what Parker said: sometimes having a naive but intelligent collaborator, who isn't captured by the biases of your field but can ask you smart questions about what you're doing, is really valuable. It 13:18:00 was like a therapist holding up a mirror to you. Honestly, 13:18:05 it's been amazing. 13:18:08 Okay, so here's the work that Karna and Madhav did. I'm sorry, I didn't mention: Derek Ping is an undergraduate who left for the greener pastures of high-energy physics at Purdue, and Kyle's a postdoc who just joined and it's working out, and such and is 13:18:23 a physics student from Urbana with me. 13:18:29 Okay, so here's the question that we're going to try to answer in the context of denitrification. 13:18:35 Literally, we want to know what the relationship is between the genes in the genomes of each strain in a community and the ultimate flux of metabolites through that community, 13:18:47 the dynamic flux of metabolites through that community. 13:18:50 Of course there are many layers of organization between these two: gene expression is happening, there are ecological interactions, the environment might matter, and so on. 13:18:59 But what we would like to do is pose this as a quantitative question: given knowledge about the genetic structure of each strain in the community, can we make a prediction about the dynamic flux of metabolites through that 13:19:13 community? 13:19:15 What I want to do is start by giving you some background on what denitrification is, and I think during that process it'll become a little bit clearer why we chose it as a model system.
13:19:24 I'm going to use Alfred Spormann's metabolism lectures, which I thought were absolutely amazing and really illuminating for me, 13:19:31 just to remind you that denitrification is a process of anaerobic respiration. Remember, Alfred taught us that aerobic respiration is using glucose as an electron donor and oxygen as an electron acceptor, transferring the electrons to oxygen, 13:19:47 harvesting the energy. 13:19:51 Denitrification is essentially the same thing, except you're doing it in the absence of oxygen, hence it's anaerobic respiration. Now you're not using oxygen as an acceptor, for the simple reason that oxygen is not available in the environment, 13:20:03 and you're reducing nitrate to nitrite. 13:20:07 That's one step of the process of denitrification. 13:20:10 So the nitrogen compounds in this system are not being used to make biomass, which typically comes from ammonia; they're simply being used as electron acceptors. 13:20:25 And in the environment, 13:20:27 it turns out that nitrate acts as an electron acceptor and is reduced to nitrite, which then acts as another electron acceptor and is reduced to nitric oxide, then nitrous oxide, and ultimately dinitrogen gas, which cannot be further reduced. 13:20:41 Okay, so when a microbiologist refers to this process of denitrification, they're referring to the full cascade of all of these reactions. Each one of these compounds behaves in exactly the same way, in the sense that it acts as an electron acceptor. 13:20:55 Okay, so it's almost like you're just creating more electron acceptors for yourself as you do this reduction. 13:21:02 And another reason to use this system as a model system in the laboratory is that the molecular genetics and enzymology of this whole process are really well understood.
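For reference, the four-step cascade described above can be written as the standard textbook reduction half-reactions. These are not taken from the talk slides; the proton and electron bookkeeping is the conventional one, and each line is one "step" in the sense used in the talk.

```latex
\begin{align*}
\mathrm{NO_3^-} + 2\,\mathrm{H^+} + 2\,e^- &\rightarrow \mathrm{NO_2^-} + \mathrm{H_2O}
  && \text{(nitrate to nitrite)} \\
\mathrm{NO_2^-} + 2\,\mathrm{H^+} + e^- &\rightarrow \mathrm{NO} + \mathrm{H_2O}
  && \text{(nitrite to nitric oxide)} \\
2\,\mathrm{NO} + 2\,\mathrm{H^+} + 2\,e^- &\rightarrow \mathrm{N_2O} + \mathrm{H_2O}
  && \text{(nitric oxide to nitrous oxide)} \\
\mathrm{N_2O} + 2\,\mathrm{H^+} + 2\,e^- &\rightarrow \mathrm{N_2} + \mathrm{H_2O}
  && \text{(nitrous oxide to dinitrogen)}
\end{align*}
```

Each product except dinitrogen is itself the electron acceptor for the next line, which is the sense in which the cell keeps creating new acceptors as it goes.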
13:21:14 A lot of that work is by a guy named Walter Zumft in Germany, but many others as well. He wrote this incredible, like hundred-page, review article where you can read about the horrifying, gory details of this whole process. 13:21:26 Okay. So the point is we know the players, we know the molecular players. 13:21:32 The other fascinating thing that we... yes, Terry, please. 13:21:40 The list, as a concrete example, is of order 50. Okay, so these are the reductases that actually perform these reactions, of which there are seven, and pretty much only seven; I think there might be some insane outlier group that 13:21:57 nobody's ever studied. Then there's regulators, and this is always a fuzzy area because a lot of them are involved in sensing the loss of oxygen, so there could be other regulators involved. 13:22:08 And then these are transporters that are involved in moving the metabolites across the inner membrane, in some cases. 13:22:14 So the ones dedicated to just nitrate to nitrite, all together how many? So there's two reductases. That's it. 13:22:26 The Nar reductase is cytoplasmic and the Nap reductase is periplasmic. 13:22:32 So how much variation is there in these reductases? 13:22:37 To my knowledge it hasn't been studied in detail in all of them. In this reductase, for example, this nitrous oxide reductase, which is of keen interest for climate change reasons, 13:22:48 there are three or four alleles that people have identified as being widely dispersed. 13:22:55 What about adaptation to the electron transport chain? Yeah, yes, so there's other cytochromes that are involved, and there are cofactors for these reductases. 13:23:07 That's why I'm saying 50 genes; it's not just for one step, that's just not true. Yes, so then there's going to be some cofactors for the Nar reductase. 13:23:17 Awesome.
Yes, and that's 50, but not for that one step; when I say 50 I'm saying the whole cascade. Oh, sorry. Yeah. 13:23:31 Was it invented many times? 13:23:34 I don't know the answer to that question. I know that there's substantial variation in some of these reductases, and some of these regulators, for example, are not just involved in regulating 13:23:48 denitrification; FNR, ANR, those are CRP-like regulators. 13:23:58 Slightly naive question, but since these genes are known, could you in principle try to do something like a flux balance model to make the prediction for the problem that you were stating before? The problem with the flux balance model in this context 13:24:13 is, well, there's two real problems. One is that we're going to do batch culture experiments, and batch culture experiments have this dynamic problem which we discussed last week. 13:24:22 So we're not doing steady-state experiments. 13:24:24 The second issue is that we've looked at the Pseudomonas aeruginosa model. Pseudomonas aeruginosa is a full denitrifier, and the model is not really designed to look at denitrification and in many cases does not include a periplasmic space. 13:24:37 This is really important because, for example, this metabolite needs to be transported out of the cytoplasm into the periplasm, and so on. So for one model organism we can get somewhat sensical stuff out of an FBA model, 13:24:49 but we haven't used it widely for that reason. 13:24:55 Another question: just ballpark numbers for growth rate? 13:24:59 Yeah, so how fast are you growing aerobically, and when you're using nitrate or nitrite? Aerobically on rich media in the lab, something like an hour to an hour and a half doubling time, and in anaerobic conditions something like three to six hours. 13:25:15 Yes. 13:25:17 Yes. 13:25:17 Yes. What if.
13:25:31 No, just fermentation, how fast? So many denitrifiers are not fermenters. Oh. So a lot of this is the dogma; I think you can find a denitrifier that is a fermenter, but the dogma in the literature is 13:25:32 that denitrifiers don't typically do 13:25:37 fermentation. E. coli does DNRA, which is a slightly different process but it's related. There's an offshoot here where you can reduce nitrite not to nitric oxide but instead to ammonia. 13:25:47 That's called dissimilatory nitrate reduction to ammonia, DNRA, and that's what E. coli does. 13:26:01 Proceed. 13:26:03 All right, so the phenotypes of denitrifiers. 13:26:07 But it does the first step? Yeah, it does the first step. 13:26:07 The denitrifiers have been looked at in several culture collections, and in fact there's one beautiful paper, Lycus 2017, where they just isolate a bunch of strains from some soil samples and phenotype them. The fascinating thing is that you 13:26:20 can find strains, like Pseudomonas aeruginosa or stutzeri, that do all four steps. These are called full denitrifiers. 13:26:29 But you can also find strains that, for example, just do the first step, or just do the last three steps. You can find a strain that'll do the first step and the third step but not the second step, you can find a strain that'll do the second step and the third 13:26:40 step but not the first, and so on. Okay, so the one relatively hard constraint, though not a total constraint, is that if you do nitrite reduction you tend to do nitric oxide reduction, because nitric oxide is cytotoxic. 13:26:55 You don't tend to see strains that produce nitric oxide without also reducing it to nitrous oxide, which is this guy. 13:27:03 So when you see those that have multiple steps, are they actually operating at the same time? It depends on the strain.
13:27:15 We can have a discussion over a beer after this about that; it's a very interesting question which we're thinking about. 13:27:20 Yeah. 13:27:22 I can assure you it's not just-in-time. 13:27:27 For these strains that do only some of the steps, 13:27:31 is it safe to assume that someone else in the community has to be doing the other ones, or is there a reason to only do some? Again, I think the answer is not truly known. One would think that if one strain is doing this step, 13:27:47 then there's somebody reducing this nitrous oxide, but another thing to note is that these last three compounds are gases. 13:27:53 And so they're lost at some rate, and this one in particular is a greenhouse gas. 13:28:01 So I didn't answer your question totally, but I think the answer is not really known, so it'd be nice to try to study that in the wild. 13:28:13 Oh, the pKa of nitrate, nitrite? Off the top of my head I don't know; look it up after. 13:28:22 OK, so denitrification tends to happen in the wild at interfaces. 13:28:29 There's a beautiful review paper by James Tiedje from the 80s, an incredibly prolific civil engineering guy, still around actually, who's done some beautiful microbial ecology work. 13:28:40 His claim, and it's borne out qualitatively, is that denitrification happens at aerobic-anaerobic interfaces in the environment. Okay. And the reason is simple. 13:28:52 It's because the aerobic process of nitrification converts ammonia to nitrate. That's an obligate aerobic process, because they're using oxygen as an electron acceptor to oxidize ammonia. And then the idea is that the nitrate diffuses across 13:29:17 the anaerobic-aerobic interface, and the denitrifiers are there happily sucking up that nitrate and turning it into nitrogen gas.
So if you look at environments that are interfaces, soil particles, sediments, or, as Otto showed in an example, the ocean, which has the 13:29:25 stratification pattern where you have oxygen and then no oxygen, at that interface you will often find the denitrifiers. Okay. 13:29:32 And so by implication, at least the assumption is that this is happening in relatively strongly fluctuating environments. 13:29:41 And here are some examples. It's very important in soils; in agricultural soils it's a very, very important process: you can lose your fertilizer if there's too much denitrification. 13:29:50 And in the oral cavity it's also happening quite a bit, where it can produce nitric oxide, which has physiological effects. 13:30:00 So that's the crash course in denitrification. I'm going to stop talking about the details now. Are there other questions about the physiology or ecology, to the extent that I can answer them? 13:30:13 Here's this. Yep. 13:30:27 I don't know. I think the way to ask that question would be to go to Pseudomonas aeruginosa or Pseudomonas stutzeri and look at the electron acceptors that it has the machinery for. I presume that there are other oxidized compounds, like sulfate 13:30:38 or something, that it might be able to use, but nitrate is energetically preferred over many of those electron acceptors. 13:30:46 Right, right. Oh sorry, I should have said: on the redox tower, after oxygen there's an iron reduction that's also energetically beneficial, but that's iron, and then nitrate is right there. 13:30:56 And then sulfate is much worse in terms of the reduction potential. And regulatory-wise, does it follow the hierarchy, given multiple choices? I have no idea. 13:31:08 I have no idea. 13:31:12 Right. So what Arvind is saying is that these do respiration with oxygen if they can, and that's true, so the presence of oxygen suppresses the expression of the denitrification enzymes.
13:31:24 And actually, there's a fascinating thing here: there's no biochemical reason you couldn't do it in the presence of oxygen. Okay, the enzymes are not oxygen sensitive. 13:31:32 In fact there are aerobic denitrifiers that run these enzymes in the presence of oxygen. 13:31:38 So yes, it's regulatory. What happens for the other acceptors, I don't know. 13:31:46 Okay, so here's the strategy that Karna has pursued. 13:31:50 The strategy is: go to the cornfields outside of UIUC, before we moved, and isolate a bunch of denitrifiers; quantify the dynamics of the metabolites in the lab, which will turn out to be a fair bit of work but not impossible; and then sequence and annotate 13:32:06 those strains and try to make a statistical prediction of the metabolite dynamics from the genomes. 13:32:17 So here's the library. Karna constructed a library of something like 80 strains; I'm going to show you data for 60 13:32:24 today, and those strains comprise these three phenotypes. 13:32:29 We're going to ignore the last three steps of denitrification because measuring these gases is a pain, but measuring the soluble substrates is not a pain. 13:32:41 And what Karna found was that about 60% or so of this library are what I'll call Nar/Nir strains; these do both of the first two steps. 13:32:50 A quarter are Nar strains, meaning they only reduce nitrate but not nitrite. 13:32:54 And we have just four strains that only do the second step. 13:33:00 And here's the phylogenetic tree of those strains. They're alpha-, beta-, and gammaproteobacteria. It turns out proteobacteria sort of dominate the denitrifiers, but there are certainly denitrifiers in other phyla, like Gram-positives. 13:33:16 They exist. 13:33:20 Okay, so here's the kind of data set that Karna is able to collect on these strains. Let me just walk you through this plot, because we'll be seeing a fair number of them.
13:33:28 Here's a Nar/Nir strain; it does both of the first two steps. I'm plotting the concentration of nitrate and nitrite as a function of time for 64 hours, done by manual sampling. 13:33:39 And what you see is that the nitrate goes away, the nitrite increases and then goes away, and at the end it's all over. Okay. And at the end point Karna goes in and measures the OD, and he knows the starting OD. 13:33:51 We're not tracking the OD dynamics in time. In principle that could be done; there's no reason we can't do that, but at the scale that we're working, in this case, it's just too slow. 13:34:00 So here's a Nar strain: reduces nitrate, makes nitrite. Here's a Nir strain: reduces nitrite. It's over. 13:34:07 Okay. 13:34:08 So these are the types of experiments that we're doing. There are replicates here, and actually the reproducibility is high, so there's like three or four replicates here and you almost can't see the differences. 13:34:32 Nitrite is toxic when the pH is low; the pH here is 7.4 and the buffer is strong. 13:34:38 Okay, so that's important. 13:34:40 But, and finally we'll talk about pH, which is fascinating. But here we've removed that from the equation because there's just too much else to deal with. 13:34:50 First, 13:34:59 That's basically identical to what many of the theorists here, Pankaj kale and Daniel and many others, have studied Ben. 13:35:07 So we have the growth of biomass; this is x dot. Okay. And there are basically three parameters: a rate of nitrate reduction, a biomass yield, which is OD per unit nitrate, and then an affinity parameter. 13:35:24 And then we have two equations governing the time dependence of A and I, which are nitrate and nitrite respectively, which we have measured here. 13:35:34 Okay.
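The model described above (biomass x, nitrate A, nitrite I, with a reduction rate, a yield, and an affinity parameter per substrate) can be sketched as a small simulation. This is a minimal sketch and not the speaker's code: the saturating (Monod-type) rate form, the additive growth terms, the parameter names `r_a`, `r_i`, `g_a`, `g_i`, `k`, and all numerical values are assumptions chosen only so that the curves resemble the qualitative behaviour described for a Nar/Nir strain (nitrate falls, nitrite rises and then falls).

```python
def simulate(x0, a0, i0, r_a, r_i, g_a, g_i, k=0.01, t_end=64.0, dt=0.01):
    """Integrate an assumed consumer-resource model with fixed-step RK4:

        dx/dt = x * (g_a * f_a + g_i * f_i)   # biomass (OD)
        dA/dt = -x * f_a                      # nitrate (mM)
        dI/dt = x * (f_a - f_i)               # nitrite (mM)

    with f_a = r_a * A / (k + A) and f_i = r_i * I / (k + I).
    Returns a list of (t, x, A, I) tuples.
    """
    def deriv(x, a, i):
        # Clamp tiny negative concentrations caused by numerical error.
        a, i = max(a, 0.0), max(i, 0.0)
        f_a = r_a * a / (k + a)
        f_i = r_i * i / (k + i)
        return (x * (g_a * f_a + g_i * f_i), -x * f_a, x * (f_a - f_i))

    state = (x0, a0, i0)
    traj = [(0.0,) + state]
    n_steps = int(round(t_end / dt))
    for n in range(1, n_steps + 1):
        k1 = deriv(*state)
        k2 = deriv(*[s + 0.5 * dt * d for s, d in zip(state, k1)])
        k3 = deriv(*[s + 0.5 * dt * d for s, d in zip(state, k2)])
        k4 = deriv(*[s + dt * d for s, d in zip(state, k3)])
        state = tuple(s + dt / 6.0 * (p + 2 * q + 2 * r + w)
                      for s, p, q, r, w in zip(state, k1, k2, k3, k4))
        traj.append((n * dt,) + state)
    return traj


# Hypothetical Nar/Nir strain: nitrite reduction slower than nitrate reduction,
# so nitrite transiently accumulates before being consumed.
traj = simulate(x0=0.01, a0=2.0, i0=0.0, r_a=4.0, r_i=2.0, g_a=0.02, g_i=0.02)
peak_nitrite = max(i for _, _, _, i in traj)
t_end, x_end, a_end, i_end = traj[-1]
```

Setting `r_i=0` mimics a Nar strain (nitrite accumulates and stays), and starting with `a0=0` and `i0>0` mimics a Nir strain.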
13:35:35 And I don't want to belabor the point, but essentially the K's are almost always quite low, micromolar or lower, and we can't really constrain their values with this data. 13:35:46 So in everything that I'll show you, we just fix K to a constant value for everybody in the library. 13:35:50 Okay. And the reason is that if you change K, the fit changes right here by a little bit, but we don't have good enough data to say how sharp that curve is, so it's gone. 13:36:02 Yes. 13:36:12 Can you go, yeah. So in the previous slide, it seemed like you had one guy who was doing some kind of diauxic shift, right? The guy who could do both depleted all the 13:36:21 Yeah, yeah, nitrate first. Yep. Yeah, so how does that get captured by the model? 13:36:25 So basically, for a Nar/Nir strain we have two terms, right? We have a nitrate reduction term and a nitrite reduction term. And in order for the model to capture this behavior, it essentially has to match rA and rI such that 13:36:39 rI in this case is a bit slower. 13:36:42 That's the only degree of freedom that it has to capture the thing. I see, so you're matching it based on the parameters, not through some kind of multi-variable 13:36:51 No, no. So the way the parameter inference works is: K is gone. We can't say anything about K; we just fix it. 13:37:09 And then the only free parameters in this ODE fit are rA and rI, so two. 13:37:14 Okay, in a Nar/Nir strain. 13:37:17 And I'm actually not showing you all the data. For each strain Karna initializes like six initial conditions, with different nitrate, nitrite, different starting OD, and so on. 13:37:25 And we globally optimize the model over all of those conditions at once. 13:37:29 Okay, with essentially two or three parameters. 13:37:35 So we're fitting something like 60 data points with two parameters.
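The global fit described above (K fixed, two free rates, one objective summed over several initial conditions) can be illustrated with a toy example. This is a hedged sketch, not the procedure used in the actual analysis: the model form, the fixed yields, the coarse grid search standing in for a real optimizer, and the synthetic noise-free "data" are all assumptions made for illustration.

```python
def simulate(x0, a0, i0, r_a, r_i, g_a=0.02, g_i=0.02, k=0.01,
             sample_times=(0, 8, 16, 24, 32, 48, 64), dt=0.02):
    """Assumed model (Monod-type rates, additive growth), integrated with RK4.
    Returns [(nitrate, nitrite), ...] at the requested sample times (hours)."""
    def deriv(x, a, i):
        a, i = max(a, 0.0), max(i, 0.0)
        f_a, f_i = r_a * a / (k + a), r_i * i / (k + i)
        return (x * (g_a * f_a + g_i * f_i), -x * f_a, x * (f_a - f_i))

    def rk4_step(s):
        k1 = deriv(*s)
        k2 = deriv(*[v + 0.5 * dt * d for v, d in zip(s, k1)])
        k3 = deriv(*[v + 0.5 * dt * d for v, d in zip(s, k2)])
        k4 = deriv(*[v + dt * d for v, d in zip(s, k3)])
        return tuple(v + dt / 6.0 * (p + 2 * q + 2 * r + w)
                     for v, p, q, r, w in zip(s, k1, k2, k3, k4))

    state, step, out = (x0, a0, i0), 0, []
    for t in sample_times:
        target = int(round(t / dt))
        while step < target:
            state = rk4_step(state)
            step += 1
        out.append((state[1], state[2]))
    return out


def global_sse(params, conditions, data):
    """Sum of squared errors over ALL initial conditions at once."""
    r_a, r_i = params
    total = 0.0
    for (x0, a0, i0), observed in zip(conditions, data):
        predicted = simulate(x0, a0, i0, r_a, r_i)
        for (a_p, i_p), (a_o, i_o) in zip(predicted, observed):
            total += (a_p - a_o) ** 2 + (i_p - i_o) ** 2
    return total


# Hypothetical initial conditions: (starting OD, nitrate mM, nitrite mM).
conditions = [(0.01, 2.0, 0.0), (0.01, 1.0, 1.0), (0.02, 0.0, 2.0)]
true_params = (4.0, 2.0)  # synthetic "ground truth" rates, for illustration only
data = [simulate(x0, a0, i0, *true_params) for x0, a0, i0 in conditions]

grid = [1.0, 2.0, 3.0, 4.0, 5.0]
best = min(((r_a, r_i) for r_a in grid for r_i in grid),
           key=lambda p: global_sse(p, conditions, data))
```

Pooling the conditions is what makes two parameters identifiable here: the condition with no nitrate constrains only `r_i`, while the others constrain both rates.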
In essence, do you ever see appreciable lag in cases where you have this diauxic-shift-like behavior? 13:37:50 I think Karna could speak to that better. I think the answer is no. 13:37:59 So you're asking, is there a situation where essentially a slower rI parameter is incapable of fitting the data, in this kind of a situation where there's accumulation? 13:38:12 Carbon sources and. 13:38:17 Yeah, I think in sum we don't see evidence of that, but that's not to say it's not happening. 13:38:25 The carbon source is succinate; it's not fermentable. So we're not looking for it. 13:38:37 I just want to ask a clarification question. You also measure population dynamics, right? So we only measure OD at the end. We know the OD at the start, and we know how much nitrate or nitrite they consumed, and from that we infer this yield parameter, 13:38:47 which just has units of OD per millimolar. Yeah, but from this you can also look back and compare, right? Have you done that? We did that in response to a very thoughtful reviewer, and we can predict the dynamics in monocultures 13:39:01 quite well. Okay. So Karna took samples at different times, the model predicts the OD, and I can pull up the slide for you, but let's not stop right now. 13:39:11 So I guess one more quick question: is it fair to say that you are treating these resources as substitutable resources in your strain? Yes. I'm sorry, maybe I should be using the Nar/Nir as the example; it's just a little complicated. 13:39:21 Yeah, so in a Nar/Nir there's a growth term for nitrate and an additive growth term for nitrite. 13:39:32 Right. And the assumption is that they're both electron acceptors for carbon; there's nothing, 13:39:37 you know, chemically inhibiting them from being used. 13:39:37 So you're simulating directly through substitutable resources. That's right, that's right. Okay.
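The yield inference mentioned above (start OD, end OD, and the amount of nitrate or nitrite consumed) can be sketched as a small least-squares problem. This is an illustrative assumption about how one might do it for a strain that uses both substrates: with additive growth, the OD gain in each condition is taken to be `g_a * nitrate_reduced + g_i * nitrite_reduced`, and the two yields are solved from several initial conditions. The function name, record layout, and numbers are hypothetical.

```python
def infer_yields(records):
    """Least-squares estimate of (g_a, g_i) in OD per mM.

    records: list of (delta_od, nitrate_reduced_mM, nitrite_reduced_mM),
    one entry per initial condition. Solves the 2x2 normal equations for
        delta_od = g_a * nitrate_reduced + g_i * nitrite_reduced.
    """
    s_aa = sum(a * a for _, a, _ in records)
    s_ai = sum(a * i for _, a, i in records)
    s_ii = sum(i * i for _, _, i in records)
    s_ad = sum(a * d for d, a, _ in records)
    s_id = sum(i * d for d, _, i in records)
    det = s_aa * s_ii - s_ai * s_ai
    if abs(det) < 1e-12:
        raise ValueError("conditions do not separate the two yields")
    g_a = (s_ad * s_ii - s_id * s_ai) / det
    g_i = (s_id * s_aa - s_ad * s_ai) / det
    return g_a, g_i


# Hypothetical endpoint data for three initial conditions in which everything
# was consumed. Nitrite reduced = initial nitrite + nitrite made from nitrate.
records = [
    (0.07, 2.0, 2.0),  # started with 2 mM nitrate, 0 mM nitrite
    (0.05, 1.0, 2.0),  # started with 1 mM nitrate, 1 mM nitrite
    (0.03, 0.0, 2.0),  # started with 0 mM nitrate, 2 mM nitrite
]
g_a, g_i = infer_yields(records)
```

For a strain that uses only one substrate the same idea reduces to a single ratio, OD gained divided by millimolar consumed.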
13:39:52 So, for a Nar/Nir, it's not an easy question to answer for a Nar/Nir strain, because you have to inhibit nitrate reduction somehow. Well, I mean the first stage versus the second stage. 13:40:05 Oh, the relative contribution to the total growth rate for a Nar/Nir strain in a phase where nitrate has run out, right? So if you have a growth curve you can see whether that's very different or just pretty much the same 13:40:19 thing. Uh huh. 13:40:21 My recollection, from the values of the parameters that we'll see in a minute, is that they're on the same order. 13:40:27 Okay, so the growth rates for nitrate and nitrite are similar, and you think they're co-utilized? It depends on the strain; that's what I was trying to say. Like this data you're showing. 13:40:37 Yeah, right, so this is one strain, but we have 60 strains. I have plenty of other strains where this red curve just stays flat, and they are co-utilized. 13:40:46 That's what I'm saying. Okay, so in that case rI is going to have to be faster than rA to effectively describe this data. 13:40:53 Right. 13:40:54 Does that answer the question? 13:40:59 Karna just says that the assumption here is that the substrates are being co-utilized, just at a slower rate, meaning nitrite is being utilized at a slower rate. 13:41:08 Right. 13:41:10 Yeah. 13:41:12 It 13:41:18 doesn't imply actual diauxie; the model captures it through adjusting these two parameters in a way that we leave to the data. Okay, so there's no repression of the other pathway 13:41:32 or any such business? We haven't put that in the model. One could put that in the model, but then one has too many parameters to fit. 13:41:41 Okay. Note to self: show more examples of these guys and include the Nar/Nir as the model example; maybe that would clarify.
13:41:53 Okay. So, 13:41:57 over a period of many months Derek and Karna performed these types of experiments in many different initial conditions for all 60 strains, actually for 80 strains. 13:42:07 Okay, so this data set really becomes the engine of this project, I think. 13:42:12 And I think it's a really valuable way to sort of disentangle what's happening in some of these processes. So what I'm showing here are the values of these four parameters for these 60 strains, along with the phylogenetic tree, and just to touch 13:42:34 on the variation, there's something like 100% coefficient of variation in each of these four columns, so they vary by quite a bit. 13:42:39 And, you know, there's some phylogenetic signal here, right, like look, these guys are all sort of similar, maybe some jumbling in here by phylogenetics. 13:42:50 Okay, so those are the phenotypes. The phenotypes are now somewhere between two and four numbers, depending on the strain: it's two numbers if you're a Nar or a Nir, and four if you're a Nar/Nir. 13:43:01 And so now we want to ask what is the relationship between those phenotypes and the genomic structure of strains. And so we whole-genome sequenced every one of these strains, annotated those genomes, dug through the genomes, 13:43:21 looked at the structure of the denitrification pathway in each genome. And what you're looking at here is: the rows are species and the columns are genes. The columns are not exactly genes, so we indicate presence or absence of not just the gene but, as he was 13:43:31 asking about, the cofactors and blah blah blah, which have a very high correlation, right, so we don't include all of these individually. 13:43:39 So, you know, narG actually means narG plus H plus I plus blah blah blah. 13:43:45 OK, so the columns are zeros and ones and they indicate simple presence and absence.
And, you know, to go back to autos talk which I said was an important inspiration for this talk, you know, auto has this adage use it or lose it, and the statement is 13:43:59 if you've got an enzyme to do something you typically do that thing. And in, you know, 78 of the 79 strains that's true we have one strain. You know that has one of these cases but does not perform that associated reaction but for every other strain phenotype 13:44:19 Yeah, just really quickly on when you said, It's not just the gene but also the CO factors. I assume there's some sort of like complimenting like if you see all but one and missing one you assume that the whole thing is there is there like To what extent. 13:44:36 There's that I think Karina should answer that question, with the microphone from hi me. 13:44:53 Yeah, there's a little bit of that going on in the case of the regulator. So for example, with the two components of Excel. 13:45:11 Yeah, if we observe that there is an annotated version of no RX or an RL. Then we have seen the presence of that and yeah that's really, that sort of thing is really restricted to that. Those regulators and under some, some of that going on the transporters as well. 13:45:17 avi how careful Do I need to be with the annotation like are you just blasting and saying this enzyme is sufficiently homologous or do I have to like look for active site residues and stuff like that. 13:45:28 So, we are leaving it up to rest annotation engine so it's trying some threshold and in terms of apology. 13:45:37 Actually, 13:45:41 if you build a tree from the presence or absence, these genes are hot different or similar is that from the phylogenetic tree that you were showing that you presumably got from 16 years. 13:45:52 I have not built that tree, I don't think you've built that tree either To my knowledge, but you can answer the question. 13:45:58 Do you have a sense. 13:46:00 Yes. 
Interesting question we haven't tried it I think my sense is that in many large regions of the, the string library, the privilege of a tree and the gene college actually quite similar but there are some parts were parts of this data set where phenotypes 13:46:20 different quite a lot in short phylogenetic distances and then you obviously see in disagreement, there is one thing I will say that of course level. The one thing I have done is if you just do PCA on this matrix. 13:46:30 The first two principal components separate these screens out by class. Okay, so at the class level, you know there's clear signal but then you know what kind of saying is, go to these guys and there's a fair bit of difference and that's also reflecting 13:46:42 differences in ra ra again. 13:46:46 Okay, so now what we want to do is connect the gene presence absence to these phenotype parameters that we measured in the lab and what we're going to do is take the sort of most naive and simplest yes Sergei, 13:47:00 always was actually confused why you want to work with presence or absence were you when you have a relatively small sequence space right. In other words, naively speaking you would expect the presence or absence to tell you nothing about the rates of 13:47:17 processes affinities and stuff like I think so, yeah, so the simple reason is dimensionality right presence absence has 17 dimensions. 13:47:27 And we have 60 data points. So if we want to look at promoter architect for a Leela variation all of a sudden we have you know 1000 variables that we're trying to characterize across the data set of 60 screens. 13:47:40 And, you know, we're in trouble. And so the simple reason is, let's see how much predictive hours, gained in presence out. 13:47:47 What do you plan to, you know, expand this at some point in the future to allow you for real. 
In Depth sequence analysis, I would love, I would love it in fact is one of the things that I've put on the syllabus of the module that I'm teaching right now 13:48:01 to look at Symphony, for example, or look at like variation. We've also taken the keg genome database and annotated them in the same way. So we have these vectors for 3000 genomes now we can start to look at these patterns with a bit more data, even in 13:48:16 the absence of phenotype, and absolutely it needs to be done, I think, know order one is how much information is there any presence. 13:48:25 So, thank you. 13:48:27 I mean, I would like you to do that. 13:48:32 So I guess to follow up on circus question in this presence at this thing does this assume that all of the parameter dependence in your model then is contained for enzymes that how I should think of it. 13:48:50 and that strain includes it it gets that, and it's, let's go let's go to the regression and I think it'll answer the question. So now we're going to do is a regression. 13:48:53 So, I'll just before you go on, because it's related to the previous question so I guess, following what man was Daniel was asking in the beginning. Oh sorry, and he was asking in the beginning, at least for the primary enzymes enzyme families among the 13:49:09 60 strains you have. 13:49:11 Do they have comparable sequence divergence across the 60 strains so if you look just at not in our map or in an IRS and irk, and you compare the sort of maximal divergence across your strains. 13:49:27 Haha, one family of enzymes more variable than others because that might be the case were going to suggest question, want to split it into two. Aha, raw. 13:49:33 Yeah, so we haven't done that analysis in detail here. There is some sort of funny work with cuna and Sinjar, which are two different enzymes that have different mechanisms that perform nitric oxide production. 
13:49:46 Those are often annotated as the same gene, and in that case what we do is we look at similarity, we use a large database to look at allelic variation and we classify them on that basis, but we haven't done that, you know, line by line by line by line. 13:50:02 So here's the model, it's very naive and simple. It says, here's the phenotype parameter, and I'm going to predict it using a linear regression on gene presence/absence, which is this binary number g_ij, plus some noise. And then I have a coefficient on 13:50:24 each gene which tells me, you know, if I have that gene, my r_A goes up or down depending on whether beta is positive or negative. And this, remember, is just an observed quantity for each strain, strain i. 13:50:38 Okay. 13:50:38 So you're saying that if you have the nir gene, it can affect the uptake rate of the... Yes, yes, so all of these genes go into the regression for each parameter. 13:50:49 So for example, the nir gene could be affecting the nitrate reduction rate. Yeah. Okay, thank you. 13:50:56 Okay, so, you know, this is one of these underdetermined regression problems. You can't do OLS because you'll overfit, so we do a lasso regression, with a regularization which Karna and Madhav and I have dug into in some detail now. 13:51:09 And here are the basic results. 13:51:12 So these are the beta vectors, meaning the loadings for each one of those genes predicting that phenotype parameter. And here are the observed, measured yields on nitrate and the predicted yields on nitrate, and the sort of in-sample R-squared is 0.75; 13:51:29 out of sample, as assayed by cross-validation, it is about 50%. 13:51:35 Okay, so we think the model is doing a pretty good job. More recently Karna went and got Pseudomonas aeruginosa PAO1, did the denitrification on it, took its g vector, and predicted its phenotype parameters very well. 13:51:47 Okay, so that's like a true out-of-sample, out-of-freezer 13:51:53 experiment. Okay. 13:51:56 So for r_A.
13:52:10 You know, the regression is not quite as good. So, you know, Sergei, here you go, right: there is variance that is unexplained, certainly, certainly, and I'll make a comment about why I think this is sort of working. I think there's multiple things at work 13:52:11 here. 13:52:13 But, you know, here are the results for these other two parameters, the yield on nitrite and the rate on nitrite. 13:52:25 Yeah. 13:52:30 Okay, so similar, you know, R-squareds. 13:52:31 Do you see a strain which is really good at all four of these phenotypes? 13:52:37 Yeah, so there's a... 13:52:41 We included a lab strain called Paracoccus denitrificans, which, as it sounds, might be a good denitrifier. It's extremely fast, it produces a ton of biomass relative to almost all of our other isolates. 13:52:54 So it's kind of a beast of denitrification. So, such a thing exists. Yes, I see, because I'm looking at all of these parameters, given that a lot of them seem to be positive. 13:53:04 Yeah, I would expect... 13:53:07 I would expect there to be a strain which has all, you know, 17 of these genes, and it would be really, really good at it. Yeah, something close to that. I think the answer is yes, and I think Paracoccus will probably be close to that boundary. Is that 13:53:21 fair, Karna? 13:53:24 All 17 genes and just... 13:53:28 So, you know, we didn't just take this regression on faith. There's a lot of careful work that has to be done to believe one of these regressions. 13:53:35 So we did a lot of work to look at denitrification genes relative to random genes. So if you randomly select genes from the genome with the same marginal frequency distribution and uncorrelated with the denitrification genes, their predictive power drops, 13:53:50 drops. If you include coarse genomic features like GC content, size of the genome, etc., the prediction doesn't really change in any substantive way.
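[Editorial sketch] The regression described above (a binary gene presence/absence matrix predicting one phenotype parameter, with an L1 penalty so that the fit does not overfit) can be illustrated roughly as follows. This is not the speakers' code. The data are synthetic, the function names `fit_lasso` and `make_synthetic` are hypothetical, the penalty strength is arbitrary, and plain coordinate descent stands in for whatever solver was actually used. Cross-validation, which the talk uses to estimate out-of-sample performance, is omitted.

```python
import random


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; exactly zero inside [-lam, lam]."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def fit_lasso(X, y, lam, n_sweeps=200):
    """Minimize (1/2n)*||y - b0 - X@beta||^2 + lam*||beta||_1 by coordinate descent.

    X is a list of rows (one per strain) of 0/1 gene presence values,
    y is the measured phenotype parameter for each strain.
    """
    n, p = len(X), len(X[0])
    y_mean = sum(y) / n
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    # Centering the columns and the response handles the unpenalized intercept.
    Xc = [[row[j] - col_means[j] for j in range(p)] for row in X]
    z = [sum(Xc[i][j] ** 2 for i in range(n)) / n for j in range(p)]
    residual = [yi - y_mean for yi in y]
    beta = [0.0] * p
    for _ in range(n_sweeps):
        for j in range(p):
            if z[j] == 0.0:
                continue  # gene present in all strains or in none: no information
            # Correlation of gene j with the residual, with gene j's own term added back.
            rho = sum(Xc[i][j] * residual[i] for i in range(n)) / n + z[j] * beta[j]
            new_beta = soft_threshold(rho, lam) / z[j]
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= Xc[i][j] * delta
                beta[j] = new_beta
    intercept = y_mean - sum(col_means[j] * beta[j] for j in range(p))
    return intercept, beta


def make_synthetic(n_strains=60, n_genes=17, seed=0):
    """Synthetic stand-in for the strain library: only two genes truly matter."""
    rng = random.Random(seed)
    true_beta = [0.0] * n_genes
    true_beta[0] = 2.0
    true_beta[3] = -1.0
    X = [[rng.randint(0, 1) for _ in range(n_genes)] for _ in range(n_strains)]
    y = [1.0 + sum(b * g for b, g in zip(true_beta, row)) + rng.gauss(0.0, 0.1)
         for row in X]
    return X, y, true_beta


if __name__ == "__main__":
    X, y, true_beta = make_synthetic()
    intercept, beta = fit_lasso(X, y, lam=0.01)
    for j, (est, true) in enumerate(zip(beta, true_beta)):
        print(f"gene {j:2d}: fitted {est:+.2f}, true {true:+.2f}")
```

A positive fitted coefficient means strains carrying that gene tend to have a larger value of the phenotype parameter, which is how the loadings in the talk are read. With correlated or mutually exclusive genes, as discussed later in the Q&A, individual coefficients should be interpreted with caution.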
13:53:58 And if you expand the number of genes by just tacking on additional genes to the regression, up to, you know, 500 or more, it doesn't really change the predictive power. 13:54:07 So it looks like, you know, at a relatively coarse level, knowing which denitrification genes you have is a reasonably good predictor of these four phenotypic parameters. 13:54:26 So it looks like there are some strains that have multiple genes within a step. Is that true? Within a what? Like within one of the steps of the denitrification process. 13:54:38 Oh, yeah. So, that's true, yeah. And so given that, why do you expect this... Yeah, yeah, so why would you expect the model to be additive? Like, would you a priori expect... Yeah, look, the reason we chose an additive model is for statistical reasons, 13:54:53 not mechanistic ones, right. If we include quadratic terms, we're in trouble at that point, and we just don't have enough data to learn them. Right, I mean, but you could add a couple others, you know, you could add a, like, "does it have both". It's like, yeah, 13:55:06 but the problem is this explodes and you can't do it in a rational way. So either you do the simplest model, you know, because I could say, oh, narG is regulated by narXL so I should have a cross term there, and then, you know, narK1K2 is responsible 13:55:18 for transporting nitrate so I should have an interaction term. So it just explodes and I don't think it can be done carefully. I think one thing I want to add to that point is we actually did try adding quadratic terms corresponding to "do you have both 13:55:33 narG and napA", that being the pair which is most frequently co-present, and that did not add to the predictive power of the model. 13:55:42 It almost seems to me like you're already including, like, these are the quadratic terms, right? Like the stupidest linear thing would be, you know, if you have the gene or not, do you have an r_A or not. 13:55:53 Right.
Is that a fair so so yeah so actually it's a little bit that that's included in the offset because we only include strains that perform that reaction in the regression for that phenotype parameter because otherwise we have no response variable. 13:56:05 This is sort of how the other genes affect that. Right, right, right. Yeah, it seems like interaction to me. 13:56:16 Did you try any of those are machine learning techniques like random forest for instance we are supposed to. Yeah, randomly well and comparable size of the data points, as I recall random forest underperformed last Oh, 13:56:25 May effectively look like an interaction term like Oh, yeah. Yeah. 13:56:34 I don't have any comment about why. 13:56:37 Okay podcasters know we're getting into, machine learning 13:56:47 all open some alcohol later and we can have a discussion about it but like, let's let's let's, let's go. 13:56:57 Okay, so, one might ask why is this working, you know, Sergey when we told them we were going to do this we were still in our band and he said, that's a bad idea, guys. 13:57:07 And you know, I think it's not an unreasonable reaction for him to have said that. 13:57:11 But it did work So did we learn anything from this so if you look at these regression coefficients and then you go read all the molecular biology literature about the pathway. 13:57:20 There are a few, but not complete stories about these coefficients that you can tell this is maybe at your peril. One of them is that you know this nerve enzyme which performs the second step, this, this enzyme is much faster than the other one or s. 13:57:35 And what we observe is a coefficient that's much larger in the regression so this is just saying that strains that have no k on average, do ri faster. 13:57:43 Okay. And that makes sense with what is known about mutants and Pseudomonas and wild type and isolates of other types that have either an rk or. 
13:57:55 I think in part, there's a conserved phenotypic impact of some of these reductive. 13:58:10 are too correlated it randomly breaks the symmetry. Right, right. So how can you interpret features this way when things come co correlated so this is this is exactly why I'm saying like, one should not take these too seriously right one could expand 13:58:18 one's data set, and start to see this breakdown and then this just so story that I'm telling is complete BS. 13:58:26 That Asterix is there. 13:58:31 One also some sample the data and asked what the betas look like right that would be the sort of system. 13:58:37 We could such that the features can ask, real quick. 13:58:41 So the fact that here in RS has a zero coefficient that kind of only works because you only included yeah the strains that actually do that reaction. Yeah. 13:58:52 So, the other one adds to it but it's kind of fake interpretation this one doesn't contribute at all. So should I be worried about that yeah well so actually, it turns out that in the data and duress and rk are mutually exclusive. 13:59:04 You either have one or the other. This is the correlation that I'm talking about. And so, and that's actually known broadly in the literature that these are for some reason genetically repelling. 13:59:14 And so, in this case you've either got one or the other. If you've got ku get a boost if you've got sem. 13:59:19 So there's an effective correlation term in the data right, this is the issue that was pointing out and I don't mean to sort of overshadowed what I see, I see. 13:59:33 the strains here that you don't see do that reaction and expect your thing to predict a low value right this is combined So the thing is you've got a different beta coefficient for a different phenotype. 13:59:48 Right. So if I'm predicting a strain that does nitrate reduction I use a different vector for RA and our, our. 14:00:00 Three. 14:00:01 Right. 
14:00:01 I just want to clarify something I think I'm misunderstanding something so if you repeat the fit, but you drop the presence or absence, the, the particular values that represent the enzyme to do the other reaction reaction that you're not fitting. 14:00:15 Right, so suppose you're fitting for rate of consumption of nitrate, or yield on nitrate and you drop nor km nor as from your vector. 14:00:23 Do you do as well I think the answer that question is no. 14:00:33 So that question is no. unrelated right because like related I'm guessing you lesson she crushed the system it will give you anything. 14:00:41 Right. But I think, I think this is, you know, I'd caution on to Step as statements here, we see that there are genes that are putatively unrelated to the process that hand in having some predictive power so I think it's not so simple as saying, which 14:00:55 is why I'm only saying drop the enzymes not any of the things that it's hard to figure out where and how they play out right like because presumably naughty and RS is the direct enzyme off that step from night right to nitric oxide, the current point 14:01:07 is that you know there's this transporter or these regulators and keep all that, I'm saying just drop the cells that correspond to enzyme but you know it has a specific role. 14:01:16 Yeah, don't toss anything that you can't predict one way or the other or may have mixed roles, and then the question is what the question is, does it do as well. 14:01:27 I don't have an answer for that question. I don't know if I don't think current does he know we haven't tried to, 14:01:39 This is slightly related to what James was saying but he for some of these jeans you actually coalesced some of the annotations right yeah. Yeah, don't do that but use all the genes. 14:01:49 Um, yeah. 14:01:51 Yeah, you can stay the same. 
So I think then the thing is you're often getting into annotation errors in some cases, right, because if you have narG, you absolutely need this molybdenum cofactor that it's going to use, right. So then there's a 14:02:03 biosynthetic pathway for that cofactor. You might be missing one gene in that pathway, but that's because RAST blew it, right. 14:02:19 But you could still do the regression and see how well it goes. Yes, by all means. 14:02:30 Okay, sorry, I want to go on to the next part if that's okay, which is to, you know, I'm claiming that we have a prediction from gene content to these phenotype parameters which is statistical in nature, not necessarily causal or mechanistically 14:02:48 that informative. And what it means is we can go from sort of genomes to these phenotype parameters that we measured, and now what we would like to do is try to think about how strains in communities are combining to do nitrate and nitrite reduction. 14:03:04 So here I think, you know, the purpose of the model becomes crystal clear. So you take a model for a Nar/Nir strain, that's these four parameters, you take a model for a Nar strain, and you simply put them together and integrate them numerically, and 14:03:19 this gives you a prediction for the nitrate and nitrite dynamics in batch culture. And there are again no free parameters, right, because we measure these in monoculture and then just ask. 14:03:30 And we actually did it this way for once: we made the measurements in monoculture, and then did it in coculture. 14:03:40 And this is the result for this. 14:03:49 And so, the assumptions of this model are strong, right. The assumptions are that the only interaction between these two strains is the cross-feeding of these electron acceptors. 14:03:59 There's no antibiotics going on.
There's no acetate insanity happening where, you know, lots of metabolites are getting excreted and they're being exchanged. 14:04:07 It's simply a competition for these electron acceptors in a medium where carbon is not limiting. 14:04:12 Yes. 14:04:16 How much does this change when you change your pH? Okay, so that's a very important question, which is like a whole nother study, which we are working on, but, you know, come back in three years, for sure. 14:04:27 For sure, for sure. Yes. 14:04:38 And the question is, can you use these consumer-resource models to predict the outcome of a serial dilution experiment? In sort of a rough pass I would say we had like an 80% success rate over, you know, 10 rounds of... 14:04:49 Cool. 14:04:50 So there were some failures. I don't remember the details. 14:04:55 Okay, so one can do this, you know, sort of ad infinitum with pairs of the 60 strains. And what you find is that anytime you combine a Nar/Nir and a Nir, or a Nar/Nir and a Nar/Nir, the model works quite well. 14:05:11 Okay. 14:05:13 Anytime... 14:05:15 Okay, so anytime you combine a Nar with a Nir, you get this sort of catastrophic failure of the model. The prediction is off qualitatively. 14:05:24 Okay. 14:05:24 And so Karna did all pairs of 12 strains, where he combined Nar, Nir, Nar/Nir and so on. 14:05:30 And the heat map is showing you a quantitative measure of how well the model is fitting the data. And there's this sort of beautiful pattern here, right: the Nir strains with other Nir strains work pretty well, all of these quadrants work, but 14:05:43 in the Nar plus Nir (even I still have trouble with it), 14:05:49 you know, that quadrant is lighting up, and what's happening in every one of these cases is that the nitrate reduction rate is slower than the model expects. 14:05:59 Okay, so this is always happening. Oh, sorry. In this case, so we haven't dug into this. This is I think again a separate study that one could engage in.
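[Editorial sketch] The "put the monoculture models together and integrate them numerically" step can be illustrated roughly as below. This is a sketch under stated assumptions, not the speakers' implementation. Uptake is assumed to follow a saturating (Monod) form with a single fixed half-saturation constant `k_half`. Each strain is described by a rate and a yield on nitrate (`r_a`, `g_a`) and on nitrite (`r_i`, `g_i`), with zeros for a step the strain does not perform. All parameter values, units and the function name `simulate` are made up for illustration. The only coupling between strains is through the shared nitrate and nitrite pools, which is the strong assumption stated in the talk.

```python
def simulate(strains, x0, a0, i0, k_half=0.01, t_end=96.0, dt=0.01):
    """Integrate a batch coculture with fixed-step RK4.

    State is [biomass of each strain..., nitrate A, nitrite I].
    Returns the final state as a list in that order.
    """

    def deriv(state):
        *biomass, a, i = state
        a_pos, i_pos = max(a, 0.0), max(i, 0.0)  # guard against tiny negative values
        sat_a = a_pos / (k_half + a_pos)
        sat_i = i_pos / (k_half + i_pos)
        d_biomass, d_a, d_i = [], 0.0, 0.0
        for x, s in zip(biomass, strains):
            flux_a = s["r_a"] * x * sat_a  # nitrate reduced to nitrite by this strain
            flux_i = s["r_i"] * x * sat_i  # nitrite reduced further by this strain
            d_biomass.append(s["g_a"] * flux_a + s["g_i"] * flux_i)
            d_a -= flux_a
            d_i += flux_a - flux_i
        return d_biomass + [d_a, d_i]

    def rk4_step(state):
        k1 = deriv(state)
        k2 = deriv([s + 0.5 * dt * k for s, k in zip(state, k1)])
        k3 = deriv([s + 0.5 * dt * k for s, k in zip(state, k2)])
        k4 = deriv([s + dt * k for s, k in zip(state, k3)])
        return [s + dt / 6.0 * (a + 2.0 * b + 2.0 * c + d)
                for s, a, b, c, d in zip(state, k1, k2, k3, k4)]

    state = list(x0) + [a0, i0]
    for _ in range(int(round(t_end / dt))):
        state = rk4_step(state)
    return state


# Hypothetical monoculture-derived parameters (illustrative values only).
NAR_NIR = {"r_a": 2.0, "g_a": 0.02, "r_i": 2.0, "g_i": 0.02}  # performs both steps
NAR_ONLY = {"r_a": 3.0, "g_a": 0.02, "r_i": 0.0, "g_i": 0.0}  # nitrate -> nitrite only

if __name__ == "__main__":
    x1, x2, nitrate, nitrite = simulate([NAR_NIR, NAR_ONLY],
                                        x0=[0.01, 0.01], a0=2.0, i0=0.0)
    print(f"final biomass: {x1:.4f}, {x2:.4f}; nitrate {nitrate:.2e}, nitrite {nitrite:.2e}")
```

Because no parameter is fitted to the coculture, comparing a simulation like this to measured pair-culture dynamics is a direct test of the additivity assumption. The Nar plus Nir failures described above are cases where that comparison breaks down.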
14:06:08 So why is this happening. 14:06:11 One reason could be that these strains are producing nitric oxide that nitric oxide is side of toxic and these trains are not dealing with it well. 14:06:19 And the sort of genetics of the strains that are most strongly inhibited are consistent with that because this guy for example lacks the nitric oxide production is to alleviate that toxicity. 14:06:32 You know the shaking is is serious here right we're taking a serious, this is a conjecture. it may be something else. 14:06:40 We have sort of qualitative results here. I mean I think this is definitely true that the nitrate reduction rate is always lower than the model would have you believe that it should be. 14:06:54 Okay. 14:06:54 Okay. 14:06:58 How much do you think that is about just that strain and about the whole phenotype because the way you do the fits that strain probably dominates the whole quadrant so I. 14:07:10 Is it really about naar box with near or is it just that that stream dominates the fits because the this strain. Yeah, because because it has similar enzyme contents as the other night. 14:07:23 Yeah. So the thing is, right so you're asking me to do it with more effectively which is all I can do right now because I ran out a nurse grants or not or drop that one stream refit everything and see if you can predict better know so i mean that still 14:07:38 work okay so like this strain right here is also, you know, five or six on this normalized scale is quite large, right corner wants to read about you 14:07:51 There may be a small point of confusion here the predictions are not based on the genomes, so there's no, there's no additional. There's nothing happening with it where if you drop a strain the regression is or predictions that are based on the phenotypes 14:08:05 that were measured in monoculture. Yeah. That's important. Sorry. So we're using the monoculture phenotypes not the genome predicted types because that would incur additional or. 
14:08:12 Thank you. Yes we clarified this in our paper, and I haven't clarified it yet. 14:08:21 Okay, so you know everybody wants people people have asked us like what if you do three four or five string communities, and you know it works pretty well and three four or five string communities as long as they don't contain an arm or pair. 14:08:35 But it turns out we can sort of contain the failure mode of the non repair by treating them as a single effective species so we can make a three species community prediction by resetting the pair culture data from the non repair that did work and treating 14:08:50 that as a single effective species in a three species community. And then the model predictions are sort of recovered so basically we're just giving the the model additional data, and then a prediction is improving perhaps that's not all that surprising, 14:09:04 but in essence that means that the non repair having this non additive consumer resource interaction is then not messing with the interactions with the other strains and some, some serious way, that's all that same. 14:09:17 And, you know, we sort of did this all the way up to five string communities where we developed a heuristic for dealing with the fact that there are multiple learner pairs in a community of say four or five strains I don't really want to do good, but 14:09:29 but but that continues to work so when we use that pair culture data from learner strains. 14:09:34 We get good predictions for metabolite dynamics and communities of say four or five strains in terms of, you know, serial batch culture I think it's an interesting question. 14:09:52 As you were saying the goodness of this film tells you that these trains are interacting rather not the goodness of fit the fact that you can predict the community phenotypes from the individual ones tells you what that the interactions are primarily 14:10:07 just these electronic sectors. 
Yes, but also something about the dynamics, not being changed by them being captured the thought, well I mean there's a bit I was just explaining this to the students this morning right there's a bit of a selfie here right 14:10:20 right like if one strain is changing the a and I concentrations. 14:10:25 Then the other strain is acting on those with, with its phenotype parameter RA and ri. 14:10:29 So that's strange maybe growing less, right, because the other screen is eating too much of the nitrate and the model is getting that correct. 14:10:38 The model doesn't have that and yet it's getting it doesn't have it just each strain has a and I and then it's for parameters and the models getting that correct right, there's actually many other things the models getting right which I should maybe put 14:10:49 in the talk. If you predict the OD of pair cultures at the end of batch culture or three species cultures or five species cultures model gets that Right, right. 14:10:58 So, in a select Set of air cultures that we measured actually gets the relative abundances of the two strains correct in our culture it's a bit of a pain to measure, but we did that recently and it's also getting that right. 14:11:09 Okay, so I think, at least in this regime where the pH buffered and the carbon is an excess, the you know the predictions are quantitative barring this nor plus in our failure, that's the that's. 14:11:23 That is that clear. 14:11:31 Okay, so, you know, that's it for this story for now. I think I just want to say that you know we, I think we sort of made some progress in mapping the the sort of gene presents absence to the community level metabolites in a quantitative way and honestly 14:11:45 I'm quite shocked that this works. 14:11:49 And really proud of the incredible work that current has done to make it happen. I think there's a couple lessons that I learned through this project it's been really important for me. 
14:11:58 And one of them is that you know it really helped me to look at a well defined ecological phenomenon, to learn about that phenomenon and then try to explain something about it I, I feel like that is a lesson that I will take for a long time. 14:12:11 The other thing is to pick a system that's tractable unquantifiable right we can measure these metabolite concentrations without resorting to, you know, you know femtosecond spectroscopy or something more horrifying right so that's important. 14:12:23 The other thing is we can you know Carla and Derek isolated these strains from soils in six months or something, right so it wasn't a complete disaster. 14:12:32 And then I think, you know, one of the other lessons from this study is, let's look at the phenotypic variation in the wild I credit Oliver on Pankaj for really pointing this out to us like, let's look at that phenotypic variation and try to explain it. 14:12:48 quantitatively from what we can understand at the genetic and hopefully eventually like my molecular level. When we can do that then I think we start to really make progress. 14:12:57 So that's my, my pitch for why this is the correct model, a correct model system not 14:13:07 bad. 14:13:18 Thanks for was super interesting and clear. I guess the one thing that I'm still puzzling over is this business of using both those nitrogen species simultaneously. 14:13:27 Yeah, that, that just would not have been my expectation Yeah, now you believe that you regulate and you choose Yeah, yeah, I'm Deanna false your thoughts on it. 14:13:36 Um, so it's the next study right okay so there's two next studies that are sort of in the pipeline one is why split a pathway I'm not gonna talk about a current as I'm going to talk about it. 14:13:45 The other is why excreta metabolite okay so this is what you and Terry I think are sort of thinking about this strain is for some reason letting go, the list. 
14:13:54 that this strain is probably up regulating nitrate reductive is that is the enzyme that does this stuff right away when oxygen goes away. And this train is actually waiting for nitrate nitrate to be fully consumed before it turns on that part of the path. 14:14:12 So in one case you have turned on the whole pathway at once, and another case you have this turn on the first step, turn on the second step, turn on the third step process. 14:14:24 That's speculation, that's sort of where we're headed with that question, I don't know the answer, but I think it's a sharp question. 14:14:30 Okay, I'm done with the notification and I have 15 minutes to do a whole nother manuscript so we'll see if we can do it. 14:14:44 Okay, off the rails Ready, here we go. Okay, so the second project is about these close microbial communities, you know, community cycle nutrients. This, this happens locally in space so for example there's a sulfur cycle in the ocean. 14:14:59 They've actually talked about cycling and math. And these closed ecosystems, I think many of you know the beautiful work of Stan and Duka, and this work that Zach and I did was in 2015 on these close communities. 14:15:14 I think I've realized recently in the context of this idea of looking at better metabolite flows that these systems are really nice systems for asking, how do you build a community that is capable of regenerating nutrients continuously, what are the constraints 14:15:27 on a system that come out, because of the requirement that nutrients are regenerated. 14:15:33 And I'm going to skip this background, even though it's quite interesting, we can talk about it later. 14:15:37 So here's the here's the basic question. 14:15:40 If I have a system that's going to cycle nutrients. What are the constraints on the organization of that system. 
14:15:47 And so what we're going to do is develop a new method for quantifying carbon cycling, and then we're going to assemble some closed ecosystems and try to interrogate how they accomplish this carbon cycling in these posts. 14:16:10 So, just for clarity: these systems are hermetically sealed, and they're provided with only light, and they have some autotrophic species that can fix CO2 and make, you know, reduced organic carbon. So I should say that committee and collaborated with 14:16:13 police on this project over the last years. 14:16:17 Wow. 14:16:29 Okay, so just the basics of a carbon cycle in a closed ecosystem. When the lights are on, I have an autotroph: alga, cyanobacteria, whatever you want. What it does is it takes carbon dioxide and makes organic carbon and produces oxygen. 14:16:33 Okay. And the oxygen can then be used by the heterotrophic component of the ecosystem, the bacteria, whatever, to take that organic carbon and turn it back into CO2. That's aerobic respiration. 14:16:46 And when I turn the lights off this process of course stops, since photosynthesis requires light. 14:16:53 And so what that means is that if I have a system that's driven in light-dark cycles, the oxygen will rise in the light phase, assuming this rate is faster than the respiration rate. 14:17:06 And then when I turn the lights off only respiration is running. That's consuming oxygen, and the oxygen level falls, and carbon dioxide has complementary dynamics. 14:17:20 And so, roughly speaking at least, the amount of carbon cycling in the system is proportional to the amplitude of these driven oscillations. 14:17:32 Okay, so we want a method to measure the dynamics of oxygen and carbon dioxide in our hermetically sealed vessel, and we want to be able to do it for a long time. 14:17:41 There are phosphorescent and fluorescent methods for doing it. We've worked with them, the stability is poor, and so on.
So we have this alternative method, which was inspired by the work of a guy named Don urban Hoover who now works at the FDA and I'm in 14:17:54 contact with a very clever idea. 14:17:57 What he realized was the following. 14:18:00 If oxygen is produced, then the pressure in a sealed vessel will rise. And the reason for that is simply that oxygen is less soluble in water than carbon dioxide. 14:18:11 Okay, so the Henry's law coefficient is about 30-fold different. So if you take carbon dioxide and you convert it to oxygen, a larger proportion of that will go into the headspace of the sealed vessel than will remain solubilized, whereas the carbon 14:18:23 dioxide has higher solubility. Okay. That means that a change in pressure is proportional, in not a simple way, to the change in oxygen concentration in the closed ecosystem. 14:18:36 So thanks to all the cell phones, we can measure pressure, differentially at least, to exquisite precision. So here, for $10 on Amazon, you can buy this MEMS-based pressure sensor. 14:18:48 Literally $10, this is literally the one we use. We glue it inside the cap of this thing. And the sensitivity of these things is a part per million of an atmosphere, which corresponds to a change in altitude of about five centimeters. 14:19:00 Okay, so this is a beautifully sensitive device. The accuracy is terrible. 14:19:05 What. 14:19:08 Okay, maybe. 14:19:11 Yeah, yeah, it turns out the noise is, you know, bigger than 10 to the minus 6%. That's okay. 14:19:19 Okay. And so if you take a closed ecosystem and you put it in these conditions, you turn the lights on, the pressure goes up; you turn the lights off, the pressure goes down. 14:19:29 And you of course now have to do a bunch of control experiments to believe that this is actually reflecting oxygen. The easiest one to do is to actually measure oxygen currently. 14:19:38 When we do that, we see a correlation of 0.95. 14:19:40 Okay.
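The solubility argument above can be made concrete with a small sketch. This is not the speaker's calculation (which also accounts for carbon dioxide chemistry); it is a minimal illustration assuming ideal-gas behavior, equilibrium between headspace and liquid, one-for-one replacement of CO2 by O2, and no bicarbonate chemistry. The Henry's law solubilities are approximate textbook values near 25 °C, and the vessel volumes used in the usage note are hypothetical.

```python
# Sketch only: ignores carbonate chemistry and pH, which the real analysis must handle.
R = 0.082057  # ideal gas constant, L*atm/(mol*K)

# Approximate Henry's law solubilities in water near 25 C (assumed textbook values).
KH_O2 = 1.3e-3   # mol/(L*atm)
KH_CO2 = 3.4e-2  # mol/(L*atm)


def headspace_fraction(kh, v_gas_l, v_liq_l, temp_k=298.15):
    """Fraction of a gas's total moles that sits in the headspace at equilibrium."""
    n_gas_per_atm = v_gas_l / (R * temp_k)  # moles in headspace per atm of partial pressure
    n_liq_per_atm = kh * v_liq_l            # moles dissolved per atm of partial pressure
    return n_gas_per_atm / (n_gas_per_atm + n_liq_per_atm)


def pressure_change_atm(moles_converted, v_gas_l, v_liq_l, temp_k=298.15):
    """Net pressure change when `moles_converted` of CO2 is replaced by the same moles of O2."""
    def atm_per_mole(kh):
        return 1.0 / (v_gas_l / (R * temp_k) + kh * v_liq_l)
    return moles_converted * (atm_per_mole(KH_O2) - atm_per_mole(KH_CO2))
```

For a hypothetical vessel with 20 mL of headspace over 20 mL of liquid, `headspace_fraction(KH_O2, 0.02, 0.02)` is much larger than `headspace_fraction(KH_CO2, 0.02, 0.02)`, so `pressure_change_atm` is positive when CO2 is converted to O2, which is the sign of the effect described in the talk.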
14:19:41 The other thing you can do is just put water in there and turn the lights on and off, and then you see a flatline. 14:19:49 Okay, so those have to be true. And the hard part of this experiment is that the temperature control has to be really good because, of course, you know, the ideal gas law will get you every time the temperature changes. 14:19:58 So 14:20:02 I'm going to skip the details of how you calculate the carbon cycling from the change in pressure, just because, you know, talk to me later. 14:20:09 The change in pressure is proportional to the change in oxygen concentration, with an ideal gas law term and a Henry's law term. The actual calculation involves accounting for carbon dioxide, and that is difficult, and such, and we did that. 14:20:23 I'm going to skip this in the interest of time, and there are some assumptions here, which mean that the carbon cycling rate that we measure, which we can do essentially once per day-night cycle, is only good to about 50%. 14:20:38 So I wouldn't take it seriously better than a factor of two. 14:20:43 Okay, so here's the experiment that we did. 14:20:45 We went out to the yard again. This time we didn't isolate individual strains, come to the extracted bacterial communities from soils, killing the fungi areas. 14:20:57 He mixed it in a buffer in the lab with the domesticated alga Chlamydomonas reinhardtii, she popped it in one of these devices. So there's two soil samples and four replicates from each device. 14:21:10 And here's the time series for 50 day-night cycles for a single closed ecosystem. So this is the pressure as a function of time, and the orange bars are telling you when the lights are coming on, the dark bars tell you when the lights go off. 14:21:23 And here is the time series over 50 days of the carbon cycling rate inferred from these oscillations. And these eight curves are for these very complex soil-derived communities that we put in there.
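Since the cycling rate is inferred once per day-night cycle from the size of the pressure oscillation, the first step can be sketched as extracting a peak-to-trough amplitude per cycle from a pressure trace. This is a minimal illustration, not the speaker's pipeline: the function name, the 24-hour default period, and the sampling layout are assumptions, and it does not remove slow drift or convert amplitude into moles of carbon (which needs the ideal-gas, Henry's law, and carbon dioxide terms the talk skips).

```python
def cycle_amplitudes(times_h, pressures, period_h=24.0):
    """Peak-to-trough pressure amplitude for each complete light-dark cycle.

    times_h   : sample times in hours, ascending
    pressures : pressure readings, same length as times_h
    period_h  : assumed length of one light-dark cycle (hypothetical default)
    """
    if not times_h:
        return []
    t0 = times_h[0]
    by_cycle = {}
    for t, p in zip(times_h, pressures):
        idx = int((t - t0) // period_h)        # which cycle this sample falls in
        by_cycle.setdefault(idx, []).append(p)
    n_complete = int((times_h[-1] - t0) // period_h)  # drop a trailing partial cycle
    return [max(by_cycle[i]) - min(by_cycle[i])
            for i in range(n_complete) if i in by_cycle]
```

Applied to a trace that rises in the light and falls in the dark, each returned value is the quantity that, roughly speaking, scales with the carbon cycled during that day-night cycle.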
14:21:37 The green curve, which is identically zero because we can't measure any carbon cycling, is when we just have the alga by itself. 14:21:44 And the red curve is when we take the alga and we add E. coli. 14:21:49 Okay, so the sort of punch line is that when you just throw in the kitchen sink, 14:21:55 the carbon cycling rate goes up. I mean, maybe not completely shocking. The idea is that there's additional metabolic capabilities in this very complex soil-derived community that enable this carbon that's produced by the algae to be recycled. 14:22:13 No, it continues to go down. I mean, you know, after 50 days we get bored. But, yeah. 14:22:21 So there's chemical reasons to believe that E. coli should never be able to cycle with Chlamydomonas, and the reason is that the main storage carbon is starch, 14:22:31 and E. coli lacks amylase. 14:22:37 Sorry. 14:22:42 Oh, so it's not from this time trace. There's actually sort of a discontinuity that happens in the pressure there. I don't know, honestly. This is a reproducible feature of the pressure declining. 14:22:49 The crash around 32 days or something, white, in the white curve. This one, yes. 14:22:55 No, no, I'm referring to the bottom panel. One of the curves, white, seems to be doing fine, fine, fine, actually better, better, and then over one day crashes. 14:23:05 Yeah, you see this. What I'm saying is the pressure does reflect that, and I don't know why. 14:23:12 You don't believe it's like a failure, something could be staged in their 14:23:19 pages everywhere. 14:23:25 Right, roughly a factor of two. Yeah. 14:23:29 It's a bit better than that, because it depends on pH, this conversion factor, and the pH is buffered. And we measure it at the beginning and the end, and it changes a little bit but not a lot. 14:23:43 So that means the variation in this inference is better than a factor of two, probably around 20%.
Then there's this factor of the photosynthetic and respiratory coefficient, which is the stoichiometry between fixing carbon dioxide and producing 14:23:57 oxygen, and the converse, and that we really don't know. It's typically about one, but it can be off by a factor of, you know, 30, 40%. 14:24:08 So does that mean I should believe the relative much more than the absolute? Is that how I interpret this thing, that if I look at ratios, 14:24:17 I believe that much more than the absolute number? I just want to make sure. Yeah, I think that's fair. I 14:24:28 want to look at how these systems are accomplishing this carbon cycling on long time scales. So incidentally, we let some of these go for eight months, though we had to shut down the lavender banner Chicago, and they continued to cycle, and actually one of 14:24:45 them started to improve. 14:24:49 Then they got put in the freezer. Okay, so we wanted to sort of characterize these communities a little bit, look at who's there, and maybe try to look, at least as a first pass, at what they're doing metabolically. 14:25:01 And then we had sort of two ideas. One was that, you know, the soil samples all harbor similar functional taxa, which we learned from the Lorna's papers, so maybe we're selecting out, you know, a taxon that plays nice with the algae and cycles the carbon, 14:25:14 or maybe there's some sort of community-level property that's more important, and there's actually, you know, many distinct combinations of taxa that are able to cycle carbon in the presence of the algae. 14:25:28 The problem with this is we have to harvest biomass to answer any of these questions. So the only way we can harvest biomass is to open the system, take out a big chunk of it, and put in fresh media and let it go again. 14:25:38 And so that's what we did. 14:25:42 So here are the pressure traces 14:25:45 for three serial dilution events. It remains about the same here; it sort of starts to seem to drop a little bit.
14:25:52 This system crashes. The number of rare taxa is actually going down as we do this dilution. Okay, we can see that in the data. 14:26:04 But we, you know, we're trying to keep the dilution to a minimum, but we need to harvest material in order to make. Okay, I have to play moderator now. Just so you know, 2:25. 14:26:13 Right, thank you. I'll wrap up. Okay, so here's the sequencing data. This is 16S amplicon sequencing data that company acquired. And, you know, basically you can see that some of these communities from one soil sample 14:26:26 look pretty similar, but then if you go to the other soil sample you've got one that's completely different. The similarity between the two soil samples is quite low 14:26:38 overall. And, you know, Cambodia and I spent a lot of time quantifying that similarity using, you know, all the different metrics that everybody uses. And the point is to say that the differences are inter-community. That is, if I compare two closed ecosystems 14:26:49 that were initiated separately, their taxonomic variability is higher than intra-ecosystem. Maybe that's surprising. 14:26:57 But here's just sort of an embedding of those similarity metrics, Jensen-Shannon divergence in this case. And you know, what you see is that there's no, like, strong convergence. It's not like they're all going towards the same point in terms of their taxonomy. 14:27:11 So the point is simply that there's different colored species in there, and you can look at what they are and stuff, and there's some stories that are maybe a little bit. 14:27:30 So it doesn't look like there's, you know, one variant that's being recruited in every community. At what taxonomic level were you showing there? So it turns out we coarse-grained all the way to 90% similarity in 16S, and the result holds that 14:27:52 are more different than the two between each other than they are interrupting unity between.
Okay, so there is no failure order level, right? So you know, that was the first thing. So what has to be happening for the cycling to occur is that the organic 14:27:56 carbon has to be converted to CO2 through the process of aerobic respiration. That has to be happening for the oxygen to be consumed and the CO2 to be produced. 14:28:07 And so what we did was, Luis painstakingly took the samples, you know, as a kind of a one-shot experiment, right, because you get the sample after is you put it on these Biolog plates, which measure the ability to respire 32 different carbon sources in triplicate. 14:28:22 Okay. 14:28:24 And then what he does is he measures the absorbance of a dye, which is proportional to the respiration rate in the presence of that carbon source. 14:28:34 So it's almost like a perturbation experiment. You take the whole community, you put it onto the plate, and you say, are you guys respiring, you know, black and black or whatever carbon source is in that well? 14:28:45 And then you measure a timescale of that respiration for each carbon source, for each community, at each time point. 14:28:53 And then what we do, you know, if you utilize it slowly, you obviously have a small one over tau, and conversely. 14:29:04 Okay, so what we're looking at is: the columns are closed ecosystems, and the rows are these nutrients, and the heat map is the tau. And we do this at each round of this enrichment process. 14:29:15 You can see this sort of simple pattern seems to emerge, which is that the communities are becoming sort of more similar, right? They're all being down-regulated for this particular carbon source, up-regulated for some of these others.
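One simple way to put a number on "the communities are becoming more similar" is the mean pairwise correlation between the carbon-utilization profiles of the closed ecosystems, computed separately at each enrichment round. The sketch below assumes each ecosystem is summarized by a list of 1/tau values, one per carbon source; the function names and data layout are hypothetical, and Pearson correlation is one reasonable choice rather than the metric confirmed in the talk.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def mean_pairwise_correlation(profiles):
    """Average correlation over all pairs of ecosystems.

    profiles : dict mapping ecosystem name -> list of 1/tau values,
               with carbon sources in the same order for every ecosystem.
    """
    pairs = list(combinations(profiles.values(), 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)
```

Comparing `mean_pairwise_correlation` for the profiles measured before and after the enrichment rounds gives a single summary of whether the ecosystems have converged functionally, independent of how different they are taxonomically.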
14:29:29 Okay, and the relationship between these carbon sources and whatever the algae are producing is not really known. So we have some quantification of the carbon sources the algae is producing, but the relationship is not quite clear. But we can sort of quantify 14:29:42 this as a correlation between these carbon utilization profiles between closed ecosystems, and it's just saying that they're, you know, more similar to each other 14:29:51 after this enrichment process than they were before. 14:29:58 Okay, so, you know, the song not the singer, I think, is kind of the story here. You've got this carbon cycling process that's happening in the community. 14:30:07 And, you know, taxonomically, even at, you know, the class level, these communities are not all that similar, but there is this sort of functional similarity in these communities that are organized under this cycling. 14:30:20 So the idea here at least is that, you know, cycling places sort of a metabolic constraint on the system, and not necessarily a taxonomic constraint on the community. 14:30:32 Okay. I'll stop there. 14:30:36 I'm almost on time, right? Thank you, you're in time, I think, I don't know. 14:30:40 There's pressing questions. 14:30:48 This is trivial, but why doesn't Chlamy cycle on its own? I mean, it's, 14:30:53 you know, perfectly capable of doing respiration. 14:30:59 Uh, yeah, I don't know if I have a brilliant answer to that, honestly. 14:31:04 I think one reason would be that it is producing components. No, actually I think I do have an answer for that, and I think that it's a good question: that Chlamy makes carbon compounds that it cannot utilize completely, right? So the cell wall, for 14:31:17 example, of these algae is a really complicated mass, and maybe Andrew knows even better. And I'm not sure that the algae can then consume that, right? So you need some degraders in the community that can then take that on and a biomass and then oxidized 14:31:33 again.
14:31:35 Short answer be somebody else has something 14:31:41 called clearly utilizing something that the algae are producing. My response is that algae cannot utilize everything that the algae produce. That's my statement. 14:31:51 I mean, you can make a synthetic community between yeast and Chlamy in a closed system, because the Chlamy produces CO2. 14:32:04 Sorry, the yeast produces CO2 and the Chlamy leaks glucose, which it cannot itself respire, only during the light phase, right in put in the dark failure continues to limit. Right, right. 14:32:19 So, how should I think about death here in relation to your metagenomic data? Like, am I imagining some guys just sort of die, sink to the bottom, you're sequencing their genomes on the first passage but not the second one? 14:32:30 Yeah, so I think that death is an important question here. 14:32:36 It's not clear to me exactly what's happening in these communities. There's two possibilities. One possibility is that the carbon is sloshing back and forth between phototrophy and heterotrophy, 14:32:46 and there's not a lot of cycling of the other resources. 14:32:49 And so they're satisfying the maintenance requirements by exchanging carbon, but the biomass turnover rate is dramatically slow, much slower than the carbon cycling. 14:33:01 And in that case, that could be true. 14:33:04 No, no, I don't know that. Okay, so that's one scenario, absolutely don't know that, that's one scenario. The other scenario is that they're cycling all the nutrients quickly and the growth is actually pretty fast in there, and then I would expect death to 14:33:14 be less important. 14:33:17 I was gonna ask, have you tried in your medicine and get it to measure this peak-to-trough growth rate, sort of estimate? We don't have shotgun data on. 14:33:35 Yeah, no, he's crossed that method would trust it but as a way of helping to guess. Yeah. So the energy that comes in is the light.
Yeah. Have you tried to think of a global energy budget, measuring heat coming out? 14:33:44 We have not tried to take a global energy budget. 14:33:48 We can roughly estimate the quantum yield of the entire ecosystem, 14:33:53 because we know the energetics of the photosynthetic and respiratory reactions, and that quantum yield was about 1%. 14:34:02 You can also measure the biomass in the system, the carbon. You can get a timescale from the biomass in the system and the carbon cycling rate. 14:34:11 Okay, divide those numbers appropriately and you get a timescale that's something like two weeks in five days, 20 days. 14:34:18 Well, that means that the carbon cycling rate is, I would say, fast, in the sense that the number of moles of carbon being turned over is on the order of the total biomass in the system. It doesn't mean every carbon atom is going around the carbon cycle. 14:34:30 But you can calculate these global quantities of the system. I think that's fascinating. Incidentally, the quantum yield of the Earth is about one.