Model a serial format in the type system, like Servant

不打扰是莪最后的温柔 提交于 2019-12-24 08:59:55

问题


I'm working on an API integration that ignores the existence of XML or JSON in favor of just appending character data. (The Metro2 format, if interested)

I'm simplifying, but imagine that a person needs to be serialized like this:

  • At pos 0, 4 chars: Number of bytes in the message
  • At pos 5: 6 chars: "PERSON" hard coded
  • At pos 11: 20 chars: Name, left-aligned and space-padded
  • At pos 21: 8 chars: Birthday, YYYYMMDD
  • At pos 29: 3 chars: Age, right-aligned and zero-padded

Numeric fields are always right-aligned and zero-padded. Text fields are always left-aligned and space padded.

For example:

"0032PERSONDAVID WILCOX        19820711035"

Can I express this in the type system? Like what servant does? Something like this?

newtype ByteLength = ByteLength Int
newtype Age = Age Int
-- etc

type PersonMessage
     = Field ByteLength '0
    :| Field "PERSON" '5
    :| Field Name '11
    :| Field Date '21
    :| Field Age '29

-- :| is a theoretical type operator, like :> in servant
-- the number is the expected offset
-- the length of the field is implicit in the type

Can I statically check that my implementation of the serialization matches the type?

Can I statically check that the offset of the 3rd field (Name) is 11? That the lengths of the preceding fields add up to 11? I'm assuming no, since that seems like it would require full dependent type support.

Is this on the right track?

instance ToMetro Age where
   -- get the length into the type system using a type family?
   field = Numeric '3

   -- express how this is encoded. Would need to use the length from the type family. Or if that doesn't work, put it in the constructor.
   toMetro age = Numeric age

Update: Example of a function I would like to statically validate:

personToMetro :: Person -> PersonMessage
personToMetro p = error "Make sure that what I return is a PersonMessage"

回答1:


Just to give you some inspiration, just do what Servant does and have different types for the different combinators you support:

{-# LANGUAGE GADTs, DataKinds, KindSignatures, TypeOperators, ScopedTypeVariables #-}

module Seriavant where

import GHC.TypeLits
import Data.Proxy
import Data.List (stripPrefix)

data Skip (n :: Nat) = Skip deriving Show
data Token (n :: Nat) = Token String deriving Show
data Lit (s :: Symbol) = Lit deriving Show

data (:>>) a b = a :>> b deriving Show
infixr :>>

class Deserialize a where
    deserialize :: String -> Maybe (a, String)

instance (KnownNat n) => Deserialize (Skip n) where
    deserialize s = do
        (_, s') <- trySplit (natVal (Proxy :: Proxy n)) s
        return (Skip, s')

instance (KnownNat n) => Deserialize (Token n) where
    deserialize s = do
        (t, s') <- trySplit (natVal (Proxy :: Proxy n)) s
        return (Token t, s')

instance (KnownSymbol lit) => Deserialize (Lit lit) where
    deserialize s = do
        s' <- stripPrefix (symbolVal (Proxy :: Proxy lit)) s
        return (Lit, s')

instance (Deserialize a, Deserialize b) => Deserialize (a :>> b) where
    deserialize s = do
        (x, s') <- deserialize s
        (y, s'') <- deserialize s'
        return (x :>> y, s'')

trySplit :: Integer -> [a] -> Maybe ([a], [a])
trySplit 0 xs = return ([], xs)
trySplit n (x:xs) = do
    (xs', ys) <- trySplit (n-1) xs
    return (x:xs', ys)
trySplit _ _ = Nothing

Yeah so this is quite spartan, but it already allows you to do

type MyFormat = Token 4 :>> Lit "PERSON" :>> Skip 1 :>> Token 4

testDeserialize :: String -> Maybe MyFormat
testDeserialize = fmap fst . deserialize

which works like this:

*Seriavant> testDeserialize "1"
Nothing
*Seriavant> testDeserialize "1234PERSON Foo "
Just (Token "1234" :>> (Lit :>> (Skip :>> Token "Foo ")))

EDIT: Turns out I completely misread the question, and Sean is asking for serialization, not deserialization... But of course we can do that as well:

class Serialize a where
    serialize :: a -> String

instance (KnownNat n) => Serialize (Skip n) where
    serialize Skip = replicate (fromIntegral $ natVal (Proxy :: Proxy n)) ' '

instance (KnownNat n) => Serialize (Token n) where
    serialize (Token t) = pad (fromIntegral $ natVal (Proxy :: Proxy n)) ' ' t

instance (KnownSymbol lit) => Serialize (Lit lit) where
    serialize Lit = symbolVal (Proxy :: Proxy lit)

instance (Serialize a, Serialize b) => Serialize (a :>> b) where
    serialize (x :>> y) = serialize x ++ serialize y

pad :: Int -> a -> [a] -> [a]
pad 0 _x0 xs = xs
pad n x0 (x:xs) = x : pad (n-1) x0 xs
pad n x0 [] = replicate n x0

(of course this has horrible performance with all this String concatenation etc. but that's not the point here)

*Seriavant> serialize ((Token "1234" :: Token 4) :>> (Lit :: Lit "FOO") :>> (Skip :: Skip 2) :>> (Token "Bar" :: Token 10))
"1234FOO  Bar       "

Of course, if we know the format, we can avoid those pesky type annotations:

type MyFormat = Token 4 :>> Lit "PERSON" :>> Skip 1 :>> Token 4

testSerialize :: MyFormat -> String
testSerialize = serialize
*Seriavant> testSerialize (Token "1234" :>> Lit :>> Skip :>> Token "Bar")
"1234PERSON Bar "


来源:https://stackoverflow.com/questions/40667682/model-a-serial-format-in-the-type-system-like-servant

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!