Team:MIT/Measurement


Measurement
Population Separation from Flow Cytometry Data
Our project is about controlling population composition to achieve a ratio that makes a microbial consortium more robust and efficient. It was therefore vital to measure both the absolute and the relative population sizes of the two species in our consortium, C. hutchinsonii and E. coli. These measurements would show how well the two species naturally grow together: do they coexist at an even ratio, or does one dominate while the other is out-competed for resources? Is total growth higher or lower in coculture than when each species is grown separately? The answers would inform how we should engineer the two bacteria to communicate in order to achieve the desired population dynamics. Finally, distinguishing the populations would be necessary to check whether our genetic circuits generated the desired ratio between the two populations.

The Problem

Determining which cells were E. coli and which were C. hutchinsonii (and consequently how large each population was) was challenging, especially because we initially did not transform them with fluorescent proteins due to concerns about metabolic load. Hemacytometry could not be used because both bacteria move around too much to be counted reliably. Flow cytometry can determine the physical properties of cells by measuring how they scatter light as they pass through the cytometer. C. hutchinsonii has a thinner, more rod-like shape than E. coli, so we expected the two species to have distinguishable scatter profiles.

We ran samples of only E. coli and only C. hutchinsonii through the flow cytometer, but the differences between the two bacteria were not large enough to produce a clean separation: there were two overlapping populations, distinguishable only by a decrease in event density in a narrow region between them.

(the scatter plots of E. coli and C. hutchinsonii overlaid on the same axes, demonstrating the separation)

Because hundreds of samples had to be analyzed over the course of the summer, we needed an automated method for reliably separating the two populations.

The Approach

Several approaches were attempted with little success.

(attempting to separate the populations using a density-based clustering algorithm)

Population separation was finally achieved by dividing the data into slices by forward scatter width and plotting a histogram for the forward scatter height of the cells in each slice. A polynomial fit was applied to the resulting histogram and the minimum between the two peaks was taken to be the boundary between the two populations.
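The slicing and trough-finding steps can be sketched in Python with NumPy. This is a minimal illustration, not our original analysis script: the function names, the number of slices and bins, the polynomial degree, the minimum event count, and the trimming of the histogram to the 1st to 99th percentile are all assumptions chosen for the example, and the inputs are assumed to be arrays of forward scatter width and height with one entry per event.

```python
import numpy as np
from numpy.polynomial import Polynomial


def trough_boundary(fsc_h, bins=40, degree=6):
    """Return the FSC-H value at the trough between the two histogram peaks.

    Returns None if the fitted curve does not show two peaks.
    """
    # Assumption: trim extreme tails so they do not dominate the fit.
    lo, hi = np.percentile(fsc_h, [1, 99])
    counts, edges = np.histogram(fsc_h, bins=bins, range=(lo, hi))
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Polynomial.fit rescales x internally, which keeps the fit well conditioned.
    fit = Polynomial.fit(centers, counts, degree)
    grid = np.linspace(centers[0], centers[-1], 500)
    curve = fit(grid)

    # Interior local maxima of the fitted curve.
    inner = np.arange(1, grid.size - 1)
    is_peak = (curve[inner] > curve[inner - 1]) & (curve[inner] >= curve[inner + 1])
    peaks = inner[is_peak]
    if peaks.size < 2:
        return None

    # The boundary is the minimum of the curve between the two tallest peaks.
    left, right = np.sort(peaks[np.argsort(curve[peaks])[-2:]])
    trough = left + np.argmin(curve[left:right + 1])
    return float(grid[trough])


def slice_boundaries(fsc_w, fsc_h, n_slices=10, min_events=200):
    """Split events into equal-width FSC-W slices and find one boundary per slice."""
    edges = np.linspace(fsc_w.min(), fsc_w.max(), n_slices + 1)
    # Slice index of every event; the maximum value is folded into the last slice.
    idx = np.clip(np.digitize(fsc_w, edges) - 1, 0, n_slices - 1)
    result = []
    for s in range(n_slices):
        in_slice = idx == s
        if in_slice.sum() < min_events:
            # Too few events for a meaningful fit in this slice.
            result.append((edges[s], edges[s + 1], None))
        else:
            result.append((edges[s], edges[s + 1], trough_boundary(fsc_h[in_slice])))
    return result
```

Each entry returned by `slice_boundaries` is a `(slice_start, slice_end, boundary)` tuple, with `None` as the boundary where the slice had too few events or the fit did not show two peaks.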

Left: the scatter plot of the population mixture. Right: histogram of scatter events from the red “slice” of the data on the left, showing the curve fit used to separate the populations. Every event before the trough for that slice (marked with a red circle) was counted as C. hutchinsonii, and every event after it as E. coli.

Unfortunately, some samples contained too few cells to produce an accurate polynomial fit. On the other hand, we observed that the positions of the two population distributions barely shifted between samples. We therefore took several good representative samples and combined the boundaries generated for them by the above algorithm into a single boundary that could be reused for all the other samples. This also greatly reduced the time it took to analyse each sample. Another boundary was created in a similar fashion to separate the E. coli distribution from the non-living debris that also showed up in the cytometry data.
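One way to build and apply such a reusable boundary is sketched below. How the representative boundaries were combined is not specified above, so the per-slice average used here is an assumption, as are the function names and label strings. The ordering of the three groups along forward scatter height (C. hutchinsonii, then E. coli, then debris) follows the figure of the separated populations.

```python
import numpy as np


def combine_boundaries(per_sample):
    """Average per-slice boundaries over several representative samples.

    per_sample: one list per sample, holding one boundary per slice,
    with None where no boundary could be fitted.
    Assumption: the boundaries are combined by a simple per-slice mean.
    """
    table = np.array(
        [[np.nan if b is None else b for b in row] for row in per_sample],
        dtype=float,
    )
    return np.nanmean(table, axis=0)  # ignores the missing entries


def classify(fsc_w, fsc_h, slice_edges, hutch_coli, coli_debris):
    """Label each event using fixed per-slice boundaries on FSC-H."""
    n_slices = len(slice_edges) - 1
    idx = np.clip(np.digitize(fsc_w, slice_edges) - 1, 0, n_slices - 1)
    labels = np.full(fsc_h.shape, "E. coli", dtype=object)
    labels[fsc_h < hutch_coli[idx]] = "C. hutchinsonii"
    labels[fsc_h >= coli_debris[idx]] = "debris"
    return labels
```

Because classification is then only a pair of array comparisons per sample, no curve fitting is needed for the remaining samples, which is consistent with the reduction in analysis time noted above.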

(left to right: C. hutchinsonii, E. coli, debris)

To obtain an absolute cell count, we combined the dilution, the length of time and the rate at which each sample flowed through the flow cytometer to calculate the number of cells in a microlitre of sample solution. To check the accuracy of our measurement method (which also included other lab steps, such as filtering the cells to remove fragments of the filter paper the C. hutchinsonii was growing on), we diluted samples in triplicate and found an error of about 15%, although most of this was probably due to errors in the volume of liquid used, caused by the filter paper fragments in the solution. This method is reliable and fast: it produced population separation graphs for several hundred samples taken from the naive coculture in a matter of seconds, yielding growth curves that matched the predictions from our models.
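The absolute count calculation can be written out as a short function. This is a sketch under stated assumptions: the function and parameter names are hypothetical, the flow rate is taken to be in microlitres per minute, and the numbers in the usage line are illustrative, not measured values.

```python
def cells_per_microlitre(event_count, dilution_factor, flow_rate_ul_per_min, run_time_min):
    """Estimate cells per microlitre of the undiluted sample.

    event_count: events assigned to one population by the boundary.
    dilution_factor: e.g. 100 for a 1:100 dilution (assumed convention).
    """
    volume_ul = flow_rate_ul_per_min * run_time_min  # diluted volume analysed
    if volume_ul <= 0:
        raise ValueError("analysed volume must be positive")
    return event_count * dilution_factor / volume_ul


# Illustrative values only: 12,000 events in 2 minutes at 30 uL/min, 1:100 dilution.
example = cells_per_microlitre(12000, 100, 30.0, 2.0)
```

Applying the function separately to the C. hutchinsonii and E. coli event counts from the same run gives both the absolute sizes and, by division, the ratio between the two populations.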