Possible Parser For Unknown String Format(soup?) From Suds.client
I am using the suds package to query an API from a website. The data returned from their website looks like this: (1) Can anyone tell me what kind of format this is? (2) If so, what
Solution 1:
This looks like some kind of nested repr output, similar to JSON but with structure or object name information ("a Status contains a message and a code"). Because the data is nested, regexes alone won't do the job. Here is a rough pass at a pyparsing parser:
sample = """
... given sample text ...
"""
from pyparsing import *
# punctuation
LPAR,RPAR,LBRACE,RBRACE,LBRACK,RBRACK,COMMA,EQ = map(Suppress,"(){}[],=")
identifier = Word(alphas,alphanums+"_")
# define some types that can get converted to Python types
# (parse actions will do conversion at parse time)
NONE = Keyword("None").setParseAction(replaceWith(None))
integer = Word(nums).setParseAction(lambda t:int(t[0]))
quotedString.setParseAction(removeQuotes)
# define a placeholder for a nested object definition (since objDefn
# will be referenced within its own definition)
objDefn = Forward()
objType = Combine(LPAR + identifier + RPAR)
objval = quotedString | NONE | integer | Group(objDefn)
objattr = Group(identifier + EQ + objval)
arrayattr = Group(identifier + LBRACK + RBRACK + EQ + Group(OneOrMore(Group(objDefn)+COMMA)) )
# use '<<' operator to assign content to previously declared Forward
objDefn << objType + LBRACE + ZeroOrMore((arrayattr | objattr) + Optional(COMMA)) + RBRACE
# parse sample text
result = objDefn.parseString(sample)
# use pprint to list out indented parsed data
import pprint
pprint.pprint(result.asList())
Prints:
['DetailResult',
 ['status', ['Status', ['message', None], ['code', '0']]],
 ['searchArgument',
  ['DetailSearchArgument',
   ['reqPartNumber', 'BQ'],
   ['reqMfg', 'T'],
   ['reqCpn', None]]],
 ['detailsDto',
  [['DetailsDto',
    ['summaryDto',
     ['SummaryDto',
      ['PartNumber', 'BQ'],
      ['seMfg', 'T'],
      ['description', 'Fast']]],
    ['packageDto',
     [['PackageDto', ['fetName', 'a'], ['fetValue', 'b']],
      ['PackageDto', ['fetName', 'c'], ['fetValue', 'd']],
      ['PackageDto', ['fetName', 'd'], ['fetValue', 'z']],
      ['PackageDto', ['fetName', 'f'], ['fetValue', 'Sq']],
      ['PackageDto', ['fetName', 'g'], ['fetValue', 'p']]]],
    ['additionalDetailsDto',
     ['AdditionalDetailsDto',
      ['cr', None],
      ['pOptions', None],
      ['inv', None],
      ['pcns', None]]],
    ['partImageDto', None],
    ['riskDto',
     ['RiskDto',
      ['life', 'Low'],
      ['lStage', 'Mature'],
      ['yteol', '10'],
      ['Date', '2023']]],
    ['partOptionsDto',
     [['ReplacementDto',
       ['partNumber', 'BQ2'],
       ['manufacturer', 'T'],
       ['type', 'Reel']]]],
    ['inventoryDto',
     [['InventoryDto',
       ['distributor', 'V'],
       ['quantity', '88'],
       ['buyNowLink', 'https://www...']],
      ['InventoryDto',
       ['distributor', 'R'],
       ['quantity', '7'],
       ['buyNowLink', 'http://www.r.']],
      ['InventoryDto',
       ['distributor', 'RS'],
       ['quantity', '2'],
       ['buyNowLink', 'http://www.rs..']]]]]]]]
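
The nested lists printed above are positional, so looking a value up by name means walking them by hand. As a rough sketch (not part of the original answer), a small helper can fold that structure into dicts and lists. The function name `to_plain` and the `_type` key are assumptions made for illustration, and the input is a hand-trimmed fragment of the output shown above. It relies on the shape of that output: an object is a list starting with its type name, an array is a list of such objects, and anything else is a scalar.

```python
def to_plain(value):
    """Convert one parsed value (as produced by result.asList()) to dicts/lists."""
    if not isinstance(value, list):
        # scalar: a string, an int, or None
        return value
    if value and isinstance(value[0], str):
        # object: [type_name, [attr_name, attr_value], ...]
        type_name, *attrs = value
        obj = {"_type": type_name}  # "_type" is a made-up key for this sketch
        for name, attr_value in attrs:
            obj[name] = to_plain(attr_value)
        return obj
    # array: a list of objects
    return [to_plain(item) for item in value]


# trimmed fragment of the pprint output above
parsed = [
    'DetailResult',
    ['status', ['Status', ['message', None], ['code', '0']]],
    ['detailsDto',
     [['DetailsDto',
       ['packageDto',
        [['PackageDto', ['fetName', 'a'], ['fetValue', 'b']],
         ['PackageDto', ['fetName', 'c'], ['fetValue', 'd']]]],
       ['partImageDto', None]]]],
]

data = to_plain(parsed)
print(data["status"]["code"])
print([p["fetName"] for p in data["detailsDto"][0]["packageDto"]])
```

With the full parse, the same call would be `to_plain(result.asList())`. If two attributes of one object ever share a name, the later one overwrites the earlier one in this sketch.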