
Compiling And Iterating Over A Dictionary

I'm fairly new to Python and am working on building a dictionary from a file and then iterating over that dictionary. I have been working in Eclipse and am not getting any output.

Solution 1:

import re

id_to_info = {}  # declare dictionary

def parse_record(term):
    go_id = re.findall(r"id:\s(.*?)\n", term, re.DOTALL)[0]
    name = re.findall(r"name:\s(.*?)\n", term, re.DOTALL)[0]
    namespace = re.findall(r"namespace:\s(.*?)\n", term, re.DOTALL)[0]
    is_a = re.findall(r'is_a:(.*)', term, re.DOTALL)[0]
    info = namespace + "\n" + name + "\n" + is_a
    id_to_info[go_id] = info
    for go_id, info in id_to_info.iteritems():
        print(go_id + "\t" + info)

def split_record(record):
    sp_file = open(record)
    sp_records = sp_file.read()
    sp_split_records = re.findall(r"(\[.*?)\n\n", sp_records, re.DOTALL)
    for sp_record in sp_split_records:
        parse_record(term=sp_record)
    sp_file.close()

split_record(record="go.rtf")

I would suggest not using an IDE; use a terminal instead, or at least debug in the interpreter:

Python 2.7.10 (default, Jul 30 2016, 18:31:42) 
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.34)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> s = """[Term]
... id: GO:0000010
... name: trans-hexaprenyltranstransferase activity
... namespace: molecular_function
... def: "Catalysis of the reaction: all-trans-hexaprenyl diphosphate + isopentenyl diphosphate = all-trans-heptaprenyl diphosphate + diphosphate." [KEGG:R05612, RHEA:20839]
... subset: gosubset_prok
... xref: KEGG:R05612
... xref: RHEA:20839
... is_a: GO:0016765 ! transferase activity, transferring alkyl or aryl (other than methyl) groups"""
>>> import re
>>> re.findall(r'is_a:(.*)', s)
[' GO:0016765 ! transferase activity, transferring alkyl or aryl (other than methyl) groups']

Also add plenty of print statements. Python is dynamic: there is no separate compile step, so a script simply runs until it hits an error.

Your problems:

1) RegEx - Google around.
2) Typo - iteritems!

You can read about both in the Python docs, which are really good. Or pick up any book; you'll learn a lot while writing code and experimenting in the interpreter.

---Python lover!

Solution 2:

re.findall returns a list of things it found; your code assumes strings. Since you have only one hit per line, just add [0] where feasible. is_a can come back empty, so it needs a little more tender handling.
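
To make the list-versus-string point concrete, here is a minimal sketch. The `record` string is a hypothetical, shortened [Term] block, and `first_match` is a helper name I made up for illustration; it is not part of the `re` module.

```python
import re

# Hypothetical, shortened record with no is_a line at all.
record = "[Term]\nid: GO:0000010\nname: trans-hexaprenyltranstransferase activity\n"

def first_match(pattern, text, default=""):
    """Return the first group captured by re.findall, or default if nothing matched."""
    hits = re.findall(pattern, text)
    return hits[0] if hits else default

# findall always hands back a list, even when there is exactly one hit.
ids = re.findall(r"id:\s(.*?)\n", record)

go_id = first_match(r"id:\s(.*?)\n", record)    # first (and only) hit as a string
is_a = first_match(r"is_a:\s(.*?)\n", record)   # no hit, so the default is used

print(ids, repr(go_id), repr(is_a))
```

Indexing with [0] directly would raise an IndexError for the missing is_a line, which is why the fallback is worth having.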

Also, the (key, value) method is iteritems (iteration items), not interitems.
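
As a small aside on the method name: iteritems exists only on Python 2 dictionaries, while items works on both Python 2 and Python 3. The sketch below uses items so it runs on either; the single dictionary entry is sample data borrowed from the output further down, not something read from a file.

```python
# Sample entry for illustration only.
id_to_info = {"GO:0000011": "biological_process\nvacuole inheritance"}

lines = []
for go_id, info in id_to_info.items():  # on Python 2, iteritems() also works here
    lines.append(go_id + "\t" + info)

print("\n".join(lines))
```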

Here's an update:

import re

id_to_info = {}  # declare dictionary

def parse_record(term):
    go_id = re.findall(r"id:\s(.*?)\n", term, re.DOTALL)[0]
    name = re.findall(r"name:\s(.*?)\n", term, re.DOTALL)[0]
    namespace = re.findall(r"namespace:\s(.*?)\n", term, re.DOTALL)[0]
    is_a = re.findall(r"is_a:\s(.*?)\n", term, re.DOTALL)
    is_a = is_a[0] if is_a else ""
    # print namespace, name, is_a
    info = namespace + "\n" + name + "\n" + is_a
    id_to_info[go_id] = info
    for go_id, info in id_to_info.iteritems():
        print(go_id + "\t" + info)

def split_record(record):
    sp_file = open(record)
    sp_records = sp_file.read()
    sp_split_records = re.findall(r"(\[.*?)\n\n", sp_records, re.DOTALL)
    for sp_record in sp_split_records:
        parse_record(term=sp_record)
    sp_file.close()

split_record(record="go.rtf")

Output:

GO:0000010  molecular_function
trans-hexaprenyltranstransferase activity
GO:0016765 ! transferase activity, transferring alkyl or aryl (other
GO:0000011  biological_process
vacuole inheritance
GO:0007033 ! vacuole organization
GO:0000010  molecular_function
trans-hexaprenyltranstransferase activity
GO:0016765 ! transferase activity, transferring alkyl or aryl (other
GO:0000011  biological_process
vacuole inheritance
GO:0007033 ! vacuole organization
GO:0000010  molecular_function
trans-hexaprenyltranstransferase activity
GO:0016765 ! transferase activity, transferring alkyl or aryl (other
GO:0000012  biological_process
single strand break repair

I'll leave the rest of the formatting to you. :-)
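
One possible way to tidy it up, as a sketch rather than the definitive fix: the repeated entries in the output come from looping over the whole dictionary inside parse_record, so every new record reprints everything parsed so far. Building the dictionary first and printing once afterwards avoids that. The `field` and `parse_records` names and the `sample` text are my own assumptions; the sample only mimics the blank-line-separated [Term] layout, using values from the output above.

```python
import re

def field(name, term):
    """Return the value of the first 'name: value' line in term, or '' if absent."""
    match = re.search(r"^%s:\s*(.*)$" % re.escape(name), term, re.MULTILINE)
    return match.group(1) if match else ""

def parse_records(text):
    """Build {go_id: info} from text made of [Term] blocks, each followed by a blank line."""
    id_to_info = {}
    for term in re.findall(r"(\[.*?)\n\n", text, re.DOTALL):
        info = "\n".join([field("namespace", term), field("name", term), field("is_a", term)])
        id_to_info[field("id", term)] = info
    return id_to_info

# Assumed sample input; the real file would be read with open(...).read().
sample = (
    "[Term]\nid: GO:0000011\nname: vacuole inheritance\n"
    "namespace: biological_process\nis_a: GO:0007033 ! vacuole organization\n\n"
    "[Term]\nid: GO:0000012\nname: single strand break repair\n"
    "namespace: biological_process\n\n"
)

id_to_info = parse_records(sample)

# Print once, after every record has been parsed.
for go_id, info in sorted(id_to_info.items()):
    print(go_id + "\t" + info)
```

Matching each field against a whole line (re.MULTILINE with ^ and $) also means the last line of a record does not need a trailing newline to be found.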
