Skip to content Skip to sidebar Skip to footer

Pandas Read Sql Integer Became Float

I met a problem that when I use pandas to read Mysql table, some columns (see 'to_nlc') used to be integer became a float number (automatically add .0 after that). Can anyone figur

Solution 1:

Problem is your data contains NaN values, so int is automatically cast to float.

I think you can check NA type promotions:

When introducing NAs into an existing Series or DataFrame via reindex or some other means, boolean and integer types will be promoted to a different dtype in order to store the NAs. These are summarized by this table:

Typeclass   Promotion dtype for storing NAs
floating    no change
object      no change
integer     cast to float64
boolean     cast to object

While this may seem like a heavy trade-off, in practice I have found very few cases where this is an issue in practice. Some explanation for the motivation here in the next section.


Solution 2:

As already said the problem is that pandas' integer can not handle NULL/NA value.

You can replace read_sql_table with read_sql and convert NULL to some integer value (for example 0 or -1, something which has NULL sense in your setting):

df = pandas.read_sql("SELECT col1, col2, IFNULL(col3, 0) FROM table", engine)

Here col3 can be NULL in mysql, ifnull will return 0 if it is NULL or col3 value otherwise.

Or same thing with little function helper:

def read_sql_table_with_nullcast(table_name, engine, null_cast={}):
    """
    table_name - table name
    engine - sql engine
    null_cast - dictionary of columns to replace NULL:
           column name as key value to replace with as value.
           for example {'col3':0} will set all NULL in col3 to 0
    """
    import pandas
    cols = pandas.read_sql("SHOW COLUMNS FROM " + table_name, engine)
    cols_call = [c if c not in null_cast else "ifnull(%s,%d) as %s"%(c,null_cast[c],c) for c in cols['Field']]
    sel = ",".join(cols_call)
    return pandas.read_sql("SELECT " + sel + " FROM " + table_name, engine)

read_sql_table_with_nullcast("table", engine, {'col3':0})

Post a Comment for "Pandas Read Sql Integer Became Float"