Researchers affiliated with Nvidia and Harvard today detailed AtacWorks, a machine learning toolkit designed to bring down the cost and time needed for rare and single-cell experiments. In a study published in the journal Nature Communications, the coauthors showed that AtacWorks can run analyses on a whole genome in just half an hour compared with the multiple hours traditional methods take.
Most cells in the body carry around a complete copy of a person’s DNA, with billions of base pairs crammed into the nucleus. But an individual cell pulls out only the subsection of genetic components that it needs to function, with cell types like liver, blood, or skin cells using different genes. The regions of DNA that determine a cell’s function are easily accessible, more or less, while the rest are shielded around proteins.
AtacWorks, which is available from Nvidia’s NGC hub of GPU-optimized software, works with ATAC-seq, a method for finding open areas in the genome in cells pioneered by Harvard professor Jason Buenrostro, one of the paper’s coauthors. ATAC-seq measures the intensity of a signal at every spot on the genome. Peaks in the signal correspond to regions with DNA such that the fewer cells available, the noisier the data appears, making it difficult to identify which areas of the DNA are accessible.