Difference Between NUMERIC and FLOAT in BigQuery

What is the difference between NUMERIC and FLOAT in BigQuery?

I like the current answers. I want to add this as a proof of why NUMERIC is necessary:

SELECT 
4.35 * 100 a_float
, CAST(4.35 AS NUMERIC) * 100 a_numeric

(Image: query result in which a_float comes out slightly below 435 while a_numeric is exactly 435.)

This is not a bug: it is exactly how the IEEE 754 standard specifies that floating-point numbers behave. NUMERIC, meanwhile, behaves closer to what humans expect.

For another proof of NUMERIC's usefulness, this answer shows how NUMERIC can handle numbers too big for JavaScript to handle natively.

Before you blame BigQuery for this problem, you can check that most other programming languages will do the same. Python, for example:

(Image: Python session showing the same floating-point result.)
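As a minimal sketch of that comparison, the snippet below repeats the calculation in plain Python. It uses the standard-library decimal module as a rough analogy for NUMERIC; that pairing is an assumption made for illustration, not a statement about how BigQuery implements the type.

```python
from decimal import Decimal

# Binary floating point: 4.35 has no exact base-2 representation,
# so the product lands just below 435.
a_float = 4.35 * 100

# Decimal arithmetic (an analogy for NUMERIC) keeps 4.35 exact.
a_numeric = Decimal("4.35") * 100

print(a_float)
print(a_numeric)
```

The float result is not equal to 435, while the decimal result compares equal to 435.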

Avoid exponent in float data

For your requirement, you can SAFE_CAST to NUMERIC. NUMERIC values store fractional components exactly, whereas FLOAT64 is an approximate numeric data type, so values with decimal or fractional components are only approximated.

Float and float-related SQL numeric data types hold approximate numeric values. They consist of a significand (a signed numeric value) and an exponent (a signed integer that specifies the magnitude of the significand).
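To make the significand and exponent concrete, here is a small Python sketch that splits a 64-bit float into those two parts. It illustrates the general floating-point layout only; the value 4.35 is an arbitrary example, and nothing here is specific to BigQuery.

```python
import math

value = 4.35

# frexp splits a float into a significand in [0.5, 1) and a base-2 exponent
significand, exponent = math.frexp(value)

# Recombining is exact because scaling by a power of two loses no bits
rebuilt = significand * 2 ** exponent

print(significand, exponent)
print(rebuilt == value)
```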

Since NUMERIC values are exact, you can try the code below to convert the float value to NUMERIC and avoid exponents. If your numbers are too large for NUMERIC, you can use BIGNUMERIC instead.

SELECT SAFE_CAST(FORMAT('%.2f', abc) AS NUMERIC) AS deduction FROM table
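The same idea (format the float as fixed-point text, then parse that text into an exact decimal type) can be sketched in Python. The helper name to_fixed_decimal is hypothetical, and Decimal again stands in for NUMERIC as an analogy.

```python
from decimal import Decimal

def to_fixed_decimal(value: float, places: int = 2) -> Decimal:
    """Format a float with a fixed number of decimals, then parse it exactly."""
    # Fixed-point formatting never emits an exponent, mirroring FORMAT('%.2f', ...)
    text = format(value, f".{places}f")
    return Decimal(text)

big = 1.5e16
print(big)                    # default float display uses an exponent
print(to_fixed_decimal(big))  # fixed-point decimal, no exponent
```

Note that the formatting step rounds to the requested number of decimal places, so any digits beyond that are discarded.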

How to read and output numeric values properly in BigQuery?

This is due to BigQuery not understanding the localized format you're using for the numeric values. It expects the period (.) character for the decimal separator.

If you can't fix this upstream, in the process that produces the CSV files, another strategy is to load the columns into BigQuery as strings and then do some manipulation.

Here's a simple conversion example that shows some string manipulation and casting to get to the desired type. If you're using both commas and periods as part of the localized format, you'll need a more complex string manipulation.

WITH
sample_row AS (
SELECT "4000,0000000000000" as A, "15400000,000" as B,"12311918,400000" as C,"3088081,600" as D
)

SELECT
A,
CAST(REPLACE(A,",",".") AS FLOAT64) as A_as_float64,
CAST(CAST(REPLACE(A,",",".") AS FLOAT64) AS INT64) as A_as_int64
FROM
sample_row

You could also generalize this as a user defined function (temporary or persisted) to make it easier to reuse:

CREATE TEMPORARY FUNCTION parseAsFloat(instr STRING) AS (CAST(REPLACE(instr,",",".") AS FLOAT64));

WITH
sample_row AS (
SELECT "4000,0000000000000" as A, "15400000,000" as B,"12311918,400000" as C,"3088081,600" as D
)

SELECT
CAST(parseAsFloat(A) AS INT64) as A,
parseAsFloat(B) as B,
parseAsFloat(C) as C,
parseAsFloat(D) as D,
FROM
sample_row
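For the more complex case, where the localized format uses periods as thousands separators and a comma as the decimal separator, the string manipulation could look like the following Python sketch. The helper name parse_localized and the first sample value are assumptions for illustration; in BigQuery the equivalent would presumably be two nested REPLACE calls before the cast.

```python
from decimal import Decimal

def parse_localized(text: str) -> Decimal:
    """Parse a number that uses '.' for thousands and ',' for decimals."""
    # Drop thousands separators first, then switch the decimal comma to a period
    normalized = text.strip().replace(".", "").replace(",", ".")
    return Decimal(normalized)

print(parse_localized("12.311.918,400000"))
print(parse_localized("4000,0000000000000"))
```

The order of the two replacements matters: swapping the comma first would turn the decimal separator into a period that the second step then removes.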

How to get the difference in hours between two datetimes as a floating-point or decimal value in BigQuery

Expected result: 0:49:05

Below is for BigQuery Standard SQL

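SELECT FORMAT('%i%s',
  -- Whole hours come from the second-level difference, because a
  -- MINUTE-level difference counts minute boundaries crossed, not elapsed minutes
  DIV(DATETIME_DIFF(date1, date2, SECOND), 3600),
  FORMAT_TIME(':%M:%S', TIME(DATETIME_ADD(DATETIME(TIMESTAMP(DATE(1970, 1, 1))), INTERVAL DATETIME_DIFF(date1, date2, SECOND) SECOND)))
) AS diff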

For DATETIME '2019-07-05 17:42:06' date1, DATETIME '2019-07-02 15:53:01' date2 it gives

diff     
73:49:05

For DATETIME '2019-07-02 17:42:06' date1, DATETIME '2019-07-02 15:53:01' date2 result is

diff     
1:49:05

For DATETIME '2019-07-02 16:42:06' date1, DATETIME '2019-07-02 15:53:01' date2 (as in your question) result is

diff     
0:49:05

If you know that the difference will always be less than 24 hours, you can use the simpler expression below

FORMAT_TIME('%T', TIME(DATETIME_ADD(DATETIME(TIMESTAMP(DATE(1970, 1, 1))), INTERVAL DATETIME_DIFF(date1, date2, SECOND) SECOND)))   

For the last two examples the results will be 01:49:05 and 00:49:05 respectively
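The arithmetic behind these results can be sketched in Python: take the total elapsed seconds, split them into hours, minutes and seconds, and, since the question also asks for a decimal value, divide by 3600 for fractional hours. The function names are hypothetical, and the sketch assumes the first argument is the later datetime.

```python
from datetime import datetime

def format_diff(later: datetime, earlier: datetime) -> str:
    """Return the elapsed time as H:MM:SS, with hours allowed to exceed 24."""
    total_seconds = int((later - earlier).total_seconds())
    hours, remainder = divmod(total_seconds, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours}:{minutes:02d}:{seconds:02d}"

def decimal_hours(later: datetime, earlier: datetime) -> float:
    """Return the elapsed time as a fractional number of hours."""
    return (later - earlier).total_seconds() / 3600

date1 = datetime(2019, 7, 2, 16, 42, 6)
date2 = datetime(2019, 7, 2, 15, 53, 1)
print(format_diff(date1, date2))
print(round(decimal_hours(date1, date2), 4))
```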

BigQuery: Numeric DataType: Cannot store more than 19 digits

If you write just 12345678901234567890, BigQuery treats it as an INT64 literal, and because the value exceeds the INT64 range you get an error.

You need to tell the BigQuery engine that this is not an integer but a float or numeric value.

The simplest way is to write 12345678901234567890.0, in which case BigQuery treats it as FLOAT64. If you need to make sure it is NUMERIC, you can cast it explicitly, for example CAST(12345678901234567890.0 AS NUMERIC).

See also example below

#standardSQL
SELECT
12345678901234567890.0 float_value_a,
CAST(12345678901234567890.0 AS NUMERIC) numeric_value_b,
CAST('12345678901234567890' AS NUMERIC) numeric_value_c,
CAST('12345678901234567890' AS FLOAT64) float_value_d

with result

Row  float_value_a          numeric_value_b       numeric_value_c       float_value_d
1    1.2345678901234567E19  12345678901234567890  12345678901234567890  1.2345678901234567E19
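Why the bare literal fails, and why the float columns lose digits, can be checked with a short Python sketch. Python integers are unbounded, so the 64-bit limit is written out explicitly here, and Decimal is used as an analogy for NUMERIC rather than a model of it.

```python
from decimal import Decimal

INT64_MAX = 2**63 - 1  # largest value a signed 64-bit integer can hold

n = 12345678901234567890

print(n > INT64_MAX)       # the 20-digit literal does not fit in INT64
print(int(float(n)) == n)  # a 64-bit float cannot keep all 20 digits
print(Decimal(n) == n)     # a decimal type keeps every digit
```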

How to set the precision of a FLOAT in BigQuery schema

No data transformation can be performed when you upload data into BigQuery.

In general, you can try another approach:

Create a new table from the uploaded table: write a query that performs the required transformation and save its result as the new table.
For example: CREATE TABLE project_name.dataset_name.new_table_name AS SELECT Brand, ROUND(price, 5) AS Price FROM project_name.dataset_name.uploaded_table_name;

Creating a table from a query result. https://cloud.google.com/bigquery/docs/tables#creating_a_table_from_a_query_result

Weird decimal values obtained through SQL type casting

The division result is a floating-point value, so you need to cast it to the NUMERIC type:

SELECT
CAST(5 AS FLOAT64) / (1 - 0.8) AS float_original_price,
CAST(CAST(5 AS FLOAT64) / (1 - 0.8) AS NUMERIC) AS numeric_original_price
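The source of the stray decimals can be reproduced outside BigQuery. The Python sketch below performs the same division once with binary floats and once with the standard-library Decimal type, which is used here as an analogy for NUMERIC.

```python
from decimal import Decimal

# 0.8 is not exactly representable in binary, so 1 - 0.8 is slightly off 0.2
float_price = 5 / (1 - 0.8)

# With exact decimal operands, 1 - 0.8 is exactly 0.2 and the quotient is 25
decimal_price = Decimal(5) / (1 - Decimal("0.8"))

print(float_price)
print(decimal_price)
```

The float quotient is not equal to 25, while the decimal quotient is.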

