Hey,
As part of my research in genetics, I need to work with large text files that displease Excel, so I am trying to work out how to handle them via the terminal. Unfortunately, my terminal skills are nigh on zero and I don't have time to learn from scratch, so I wondered if anyone might be able to help...
Essentially, I have two files. The first has only one column (a list of genes). The second has many columns (of which the third column is also a list of genes) and many thousands of rows.
What I want to do is compare column 1 in file 1 with column 3 in file 2 and, wherever the gene names match, copy the whole row from file 2 into a new file.
By way of example, if I had the following two files...
File 1:
gene1
gene5
gene3
File 2:
ant banana gene2 bee honey horse
red green gene1 purple yellow gold
one three gene3 seven two two
elf gnome gene7 fairy ork wizard
beans chips gene10 steak sausage eggs
then the output I would want would be...
red green gene1 purple yellow gold
one three gene3 seven two two
If anybody can tell me how to do that I would be very grateful...
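For what it's worth, here is a rough sketch of one way this could be done with awk, which is available in most terminals. It assumes the columns are separated by spaces or tabs and that a match means the gene names are exactly identical (so gene1 does not match gene10). The file names file1.txt, file2.txt and matches.txt are placeholders I made up, and the first few lines only recreate the example data above so the sketch can be run as-is.

```shell
# Recreate the two example files (file1.txt / file2.txt are placeholder names).
printf '%s\n' gene1 gene5 gene3 > file1.txt
printf '%s\n' \
  'ant banana gene2 bee honey horse' \
  'red green gene1 purple yellow gold' \
  'one three gene3 seven two two' \
  'elf gnome gene7 fairy ork wizard' \
  'beans chips gene10 steak sausage eggs' > file2.txt

# While reading the first file (NR==FNR holds only there), remember each gene
# from column 1. For the second file, print rows whose column 3 was remembered.
awk 'NR==FNR { wanted[$1]; next } $3 in wanted' file1.txt file2.txt > matches.txt

# Show the result.
cat matches.txt
```

On real data only the awk line would be needed, with the actual file names substituted. Two things that may need adjusting: if the second file is tab-separated and some fields contain spaces, adding -F'\t' after awk makes it split on tabs only, and if the files were saved with Windows line endings the hidden carriage return at the end of each gene name in the first file can stop anything from matching.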