Parsing GPS receiver output via regex in Python


I have a friend who is finishing up his master's degree in aerospace engineering. For his final project, he is on a small team tasked with writing a program for tracking weat

7 answers

温柔的废话

2020-12-14 11:31

Those are comma separated values, so using a csv library is the easiest solution.

I threw that sample data you have into /var/tmp/sampledata, then I did this:

>>> import csv
>>> for line in csv.reader(open('/var/tmp/sampledata')):
...   print(line)
['$GPRMC', '092204.999', '**4250.5589', 'S', '14718.5084', 'E**', '1', '12', '24.4', '**89.6**', 'M', '', '', '0000\\*1F']
['$GPRMC', '093345.679', '**4234.7899', 'N', '11344.2567', 'W**', '3', '02', '24.5', '**1000.23**', 'M', '', '', '0000\\*1F']
['$GPRMC', '044584.936', '**1276.5539', 'N', '88734.1543', 'E**', '2', '04', '33.5', '**600.323**', 'M', '', '', '\\*00']
['$GPRMC', '199304.973', '**3248.7780', 'N', '11355.7832', 'W**', '1', '06', '02.2', '**25722.5**', 'M', '', '', '\\*00']
['$GPRMC', '066487.954', '**4572.0089', 'S', '45572.3345', 'W**', '3', '09', '15.0', '**35000.00**', 'M', '', '', '\\*1F']

You can then process the data however you wish. The '**' at the start and end of some of the values looks a little odd; to strip it off, you can do:

>>> eastwest = 'E**'
>>> eastwest = eastwest.strip('*')
>>> print(eastwest)
E

You will also have to convert some values to floats. For example, the third value on the first line of sample data is:

>>> data = '**4250.5589'
>>> print(float(data.strip('*')))
4250.5589
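Putting the steps above together, here is a minimal sketch that reads records with `csv.reader`, strips the stray `*` characters, and converts the numeric fields. The two records are copied from the sample output above; the function name `extract` and the choice of fields 2, 3, 4, 5 and 9 are assumptions made for illustration, so adjust the indices to whatever your data actually needs.

```python
import csv
import io

# Two records copied from the sample output above, including the stray '**'
# markers and the backslash before the checksum separator.
SAMPLE = (
    "$GPRMC,092204.999,**4250.5589,S,14718.5084,E**,1,12,24.4,**89.6**,M,,,0000\\*1F\n"
    "$GPRMC,093345.679,**4234.7899,N,11344.2567,W**,3,02,24.5,**1000.23**,M,,,0000\\*1F\n"
)


def extract(row):
    """Strip leading/trailing '*' from every field and convert the numeric ones.

    The field positions (2, 3, 4, 5, 9) are an assumption based on the
    values that are marked with '**' in the sample data.
    """
    fields = [field.strip('*') for field in row]
    return (float(fields[2]), fields[3], float(fields[4]), fields[5], float(fields[9]))


# io.StringIO stands in for open('/var/tmp/sampledata') so the sketch is self-contained.
rows = [extract(row) for row in csv.reader(io.StringIO(SAMPLE))]

for row in rows:
    print(row)
```

To run it against a real file, replace the `io.StringIO(SAMPLE)` argument with an open file object.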


无人及你

2020-12-14 11:31

If you need to do some more extensive analysis of your GPS data streams, here is a pyparsing solution that breaks up your data into named data fields. I extracted your pastebin'ned data to a file gpsstream.txt, and parsed it with the following:

"""
 Parse NMEA 0183 codes for GPS data
 http://en.wikipedia.org/wiki/NMEA_0183

 (data formats from http://www.gpsinformation.org/dale/nmea.htm)
"""
from pyparsing import *

lead = "$"
code = Word(alphas.upper(),exact=5)
end = "*"
COMMA = Suppress(',')
cksum = Word(hexnums,exact=2).setParseAction(lambda t:int(t[0],16))

# define basic data value forms, and attach conversion actions
word = Word(alphanums)
N,S,E,W = map(Keyword,"NSEW")
integer = Regex(r"-?\d+").setParseAction(lambda t:int(t[0]))
real = Regex(r"-?\d+\.\d*").setParseAction(lambda t:float(t[0]))
timestamp = Regex(r"\d{2}\d{2}\d{2}\.\d+")
timestamp.setParseAction(lambda t: t[0][:2]+':'+t[0][2:4]+':'+t[0][4:])
def lonlatConversion(t):
    t["deg"] = int(t.deg)
    t["min"] = float(t.min)
    t["value"] = ((t.deg + t.min/60.0) 
                    * {'N':1,'S':-1,'':1}[t.ns] 
                    * {'E':1,'W':-1,'':1}[t.ew])
lat = Regex(r"(?P<deg>\d{2})(?P<min>\d{2}\.\d+),(?P<ns>[NS])").setParseAction(lonlatConversion)
lon = Regex(r"(?P<deg>\d{3})(?P<min>\d{2}\.\d+),(?P<ew>[EW])").setParseAction(lonlatConversion)

# define expression for a complete data record
value = timestamp | Group(lon) | Group(lat) | real | integer | N | S | E | W | word
item = lead + code("code") + COMMA + delimitedList(Optional(value,None))("datafields") + end + cksum("cksum")


def parseGGA(tokens):
    keys = "time lat lon qual numsats horiz_dilut alt _ geoid_ht _ last_update_secs stnid".split()
    for k,v in zip(keys, tokens.datafields):
        if k != '_':
            tokens[k] = v
    #~ print tokens.dump()

def parseGSA(tokens):
    keys = "auto_manual _3dfix prn prn prn prn prn prn prn prn prn prn prn prn pdop hdop vdop".split()
    tokens["prn"] = []
    for k,v in zip(keys, tokens.datafields):
        if k != 'prn':
            tokens[k] = v
        else:
            if v is not None:
                tokens[k].append(v)
    #~ print tokens.dump()

def parseRMC(tokens):
    keys = "time active_void lat lon speed track_angle date mag_var _ signal_integrity".split()
    for k,v in zip(keys, tokens.datafields):
        if k != '_':
            if k == 'date' and v is not None:
                v = "%06d" % v
                tokens[k] = '20%s/%s/%s' % (v[4:],v[2:4],v[:2])
            else:
                tokens[k] = v
    #~ print tokens.dump()


# process sample data
from functools import reduce  # a builtin on Python 2, must be imported on Python 3

with open("gpsstream.txt") as f:
    data = f.read().expandtabs()

count = 0
for i,s,e in item.scanString(data):
    # use checksum to validate input: XOR of every character between '$' and '*'
    linebody = data[s+1:e-3]
    checksum = reduce(lambda a,b:a^b, map(ord, linebody))
    if i.cksum != checksum:
        continue
    count += 1

    # parse out specific data fields, depending on code field
    fn = {'GPGGA' : parseGGA,
          'GPGSA' : parseGSA,
          'GPRMC' : parseRMC,}[i.code]
    fn(i)

    # print out time/position/speed values
    if i.code == 'GPRMC':
        print("%s %8.3f %8.3f %4d" % (i.time, i.lat.value, i.lon.value, i.speed or 0))


# number of records that passed the checksum test
print(count)

The $GPRMC records in your pastebin don't seem to quite match with the ones you included in your post, but you should be able to adjust this example as necessary.
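The checksum test inside the loop above does not depend on pyparsing, so it can be pulled out into a standalone helper. The following is a minimal sketch, assuming the usual NMEA 0183 rule that the checksum is the XOR of every character between `$` and `*`; the names `nmea_checksum` and `is_valid` are made up for this example.

```python
from functools import reduce
from operator import xor


def nmea_checksum(body):
    """XOR of the character codes of `body` (the text between '$' and '*')."""
    return reduce(xor, map(ord, body), 0)


def is_valid(sentence):
    """Return True if the hex digits after '*' match the computed checksum."""
    sentence = sentence.strip()
    if not sentence.startswith('$') or '*' not in sentence:
        return False
    body, _, cksum = sentence[1:].rpartition('*')
    try:
        return nmea_checksum(body) == int(cksum, 16)
    except ValueError:
        # the checksum field was empty or not hexadecimal
        return False


# Build a sentence with a matching checksum, then corrupt one character.
body = "GPRMC,092204.999,4250.5589,S,14718.5084,E"
good = "$%s*%02X" % (body, nmea_checksum(body))
bad = good.replace("S", "N", 1)

print(is_valid(good), is_valid(bad))
```

Filtering lines with a helper like this before parsing keeps corrupted records out of the rest of the pipeline, which is the same role the `continue` plays in the loop above.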


南旧

2020-12-14 11:31

I suggest a small fix to your code: when it is used to parse data from the previous century, the date comes out in the future (for instance 2094 instead of 1994).

My fix is not fully accurate, but I assume that no GPS data existed before the 1970s.

In the parseRMC function, replace the line that formats the date with:
```
p = int(v[4:])
print "p = ", p
if p > 70:
    tokens[k] = '19%s/%s/%s' % (v[4:],v[2:4],v[:2])
else:
    tokens[k] = '20%s/%s/%s' % (v[4:],v[2:4],v[:2])
```
This looks at the two yy digits of the year and assumes that any value above 70 belongs to the previous century. A better approach would be to compare against today's date and treat any date that would fall in the future as belonging to the past century.

Thanks for all the pieces of code you provided above... I had some fun with this.
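The "compare to today's date" idea mentioned above could look roughly like the sketch below. It is an illustration rather than a drop-in replacement for the pyparsing code: the function name `rmc_date` and its `today` parameter are invented here, and it assumes the input is a six-character `ddmmyy` string and that only the 1900s and 2000s need to be told apart.

```python
from datetime import date


def rmc_date(ddmmyy, today=None):
    """Convert an RMC 'ddmmyy' string to a date that is never later than `today`."""
    if today is None:
        today = date.today()
    day, month, yy = int(ddmmyy[:2]), int(ddmmyy[2:4]), int(ddmmyy[4:6])

    # First guess: the current century (hard-coded as 20xx in this sketch).
    candidate = date(2000 + yy, month, day)
    if candidate > today:
        # A fix cannot come from the future, so assume the previous century.
        candidate = date(1900 + yy, month, day)
    return candidate


# Fix `today` so the results do not depend on when this is run.
reference = date(2020, 12, 14)
print(rmc_date("230394", today=reference))
print(rmc_date("141220", today=reference))
```

Unlike the fixed cutoff at 70, this keeps working as the years go by, at the cost of depending on the clock of the machine that does the parsing.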

不思量自难忘°

2020-12-14 11:42

You could use a library like pynmea2 for parsing the NMEA log.

>>> import pynmea2
>>> msg = pynmea2.parse('$GPGGA,142927.829,2831.4705,N,08041.0067,W,1,07,1.0,7.9,M,-31.2,M,0.0,0000*4F')
>>> msg.timestamp, msg.latitude, msg.longitude, msg.altitude
(datetime.time(14, 29, 27), 28.524508333333333, -80.683445, 7.9)

Disclaimer: I am the author of pynmea2
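For readers wondering where the decimal latitude and longitude in that output come from: NMEA encodes positions as degrees followed by minutes (`ddmm.mmmm` for latitude, `dddmm.mmmm` for longitude), with the hemisphere in a separate field. The sketch below shows the arithmetic by hand; `nmea_to_decimal` is a hypothetical helper written for this example and is not part of pynmea2.

```python
def nmea_to_decimal(value, hemisphere):
    """Convert an NMEA (d)ddmm.mmmm string plus N/S/E/W to signed decimal degrees."""
    dot = value.index('.')
    degrees = int(value[:dot - 2])     # everything before the two minute digits
    minutes = float(value[dot - 2:])   # mm.mmmm
    decimal = degrees + minutes / 60.0
    # South and west are negative by convention.
    return -decimal if hemisphere in ('S', 'W') else decimal


# Fields taken from the $GPGGA sentence in the pynmea2 example above.
lat = nmea_to_decimal('2831.4705', 'N')
lon = nmea_to_decimal('08041.0067', 'W')
print(lat, lon)
```

The results should agree with the `msg.latitude` and `msg.longitude` values shown above, up to floating-point rounding.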


执念已碎

2020-12-14 11:46

It's simpler to use split than a regex.

>>> line="$GPRMC,092204.999,4250.5589,S,14718.5084,E,1,12,24.4,89.6,M,,,0000*1F "
>>> line.split(',')
['$GPRMC', '092204.999', '4250.5589', 'S', '14718.5084', 'E', '1', '12', '24.4', '89.6', 'M', '', '', '0000*1F ']
>>>
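Note that in the output above the last element still carries the checksum and a trailing space (`'0000*1F '`). A small sketch of one way to deal with that is to strip the line and split the checksum off before splitting on commas; the function name `split_sentence` and its return shape are assumptions for illustration.

```python
def split_sentence(line):
    """Split an NMEA line into (sentence id, data fields, checksum text or None)."""
    line = line.strip()                       # drop trailing whitespace/newline
    body, sep, checksum = line.partition('*')
    fields = body.split(',')
    sentence_id = fields[0].lstrip('$')
    return sentence_id, fields[1:], (checksum if sep else None)


line = "$GPRMC,092204.999,4250.5589,S,14718.5084,E,1,12,24.4,89.6,M,,,0000*1F "
sentence_id, fields, checksum = split_sentence(line)
print(sentence_id, checksum)
print(fields)
```

Empty fields come back as empty strings, so check for `''` before calling `float()` on a value.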


名媛妹妹

2020-12-14 11:49

Splitting should do the trick. Here's a good way to extract the data as well:

>>> line = "$GPRMC,199304.973,3248.7780,N,11355.7832,W,1,06,02.2,25722.5,M,,,*00"
>>> line = line.split(",")
>>> neededData = (float(line[2]), line[3], float(line[4]), line[5], float(line[9]))
>>> print(neededData)
(3248.778, 'N', 11355.7832, 'W', 25722.5)
