We are reading data from a MongoDB collection. A collection column has values of two different types (e.g. (bson.Int64, int) and (int, float)).
I am trying to get the data type using PySpark.
My problem is that some columns contain values of more than one data type.
Assume quantity and weight are the columns:
quantity          weight
----------------  --------
12300             656
123566000000      789.6767
1238              56.22
345               23
345566677777789   21
Actually, we didn't define a data type for any column of the Mongo collection.
When I query the count from the PySpark DataFrame with dataframe.count(), I get an exception like this:
"Cannot cast STRING into a DoubleType (value: BsonString{value='200.0'})"
Your question is broad, thus my answer will also be broad.
To get the data types of your DataFrame columns, you can use dtypes, i.e.:
>>> df.dtypes
[('age', 'int'), ('name', 'string')]
This means your column age is of type int and name is of type string.
I don't know how you are reading from MongoDB, but if you are using the MongoDB connector, the data types will be automatically converted to Spark types. To get the Spark SQL types, just use the schema attribute like this:
df.schema
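For example, a minimal sketch of loading a collection through the connector and inspecting the inferred schema might look like the following. The URI, database, and collection names are placeholders; the "mongodb" format string is for connector 10.x, while older connector versions use "mongo" and the spark.mongodb.input.uri setting.

from pyspark.sql import SparkSession

# Placeholder connection details -- replace with your own.
# Requires the mongo-spark-connector package on the classpath.
spark = (SparkSession.builder
         .appName("mongo-schema-check")
         .config("spark.mongodb.read.connection.uri", "mongodb://localhost:27017")
         .getOrCreate())

df = (spark.read.format("mongodb")           # "mongo" on connector versions < 10
      .option("database", "mydb")            # placeholder database name
      .option("collection", "mycollection")  # placeholder collection name
      .load())

df.printSchema()   # tree view of the Spark SQL types the connector inferred
print(df.schema)   # the underlying StructType
print(df.dtypes)   # list of (column name, type) pairs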
It looks like your actual data and your metadata have different types: the actual data is of type string while the metadata says double.
As a solution, I would recommend recreating the collection with the correct data types.
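If rebuilding the collection is not practical, another option is to stop Spark from guessing: supply an explicit schema that reads the mixed columns as strings, then cast them afterwards. This is only a sketch, assuming the column names quantity and weight from the question and that the connector can render each BSON value as a string:

from pyspark.sql import SparkSession
from pyspark.sql.functions import col
from pyspark.sql.types import StructType, StructField, StringType

spark = SparkSession.builder.getOrCreate()   # reuse or create the session

# Read the mixed columns as plain strings so the load itself does not cast.
explicit_schema = StructType([
    StructField("quantity", StringType(), True),
    StructField("weight", StringType(), True),
])

raw_df = (spark.read.format("mongodb")
          .schema(explicit_schema)
          .option("database", "mydb")            # placeholder
          .option("collection", "mycollection")  # placeholder
          .load())

# Cast to the types you actually want; values that cannot be parsed become null.
clean_df = (raw_df
            .withColumn("quantity", col("quantity").cast("long"))
            .withColumn("weight", col("weight").cast("double")))

clean_df.printSchema()
print(clean_df.count())   # should no longer trip over BsonString values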
I am assuming you are looking to get the data type of the data you read.
input_data = [Read from Mongo DB operation]
You can use type(input_data) to inspect the data type.
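For example, assuming input_data is the DataFrame returned by a connector read like the one sketched earlier:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()
input_data = spark.read.format("mongodb").load()   # hypothetical read, options omitted

print(type(input_data))
# <class 'pyspark.sql.dataframe.DataFrame'>

# type() only reports the Python object type (a DataFrame here);
# use input_data.dtypes or input_data.schema for the per-column types.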
Source: https://stackoverflow.com/questions/45033315/get-datatype-of-column-using-pyspark