Diffstat (limited to 'sqlglot/dataframe/README.md')
 sqlglot/dataframe/README.md | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/sqlglot/dataframe/README.md b/sqlglot/dataframe/README.md
index 02179f4..86fdc4b 100644
--- a/sqlglot/dataframe/README.md
+++ b/sqlglot/dataframe/README.md
@@ -9,7 +9,7 @@ Currently many of the common operations are covered and more functionality will
## Instructions
* [Install SQLGlot](https://github.com/tobymao/sqlglot/blob/main/README.md#install); that is all that is required to generate SQL. [The examples](#examples) show generating SQL and then executing that SQL on a specific engine, which requires that engine's client library.
* Find/replace all `from pyspark.sql` with `from sqlglot.dataframe`.
-* Prior to any `spark.read.table` or `spark.table` run `sqlglot.schema.add_table('<table_name>', <column_structure>)`.
+* Prior to any `spark.read.table` or `spark.table` run `sqlglot.schema.add_table('<table_name>', <column_structure>, dialect="spark")`.
* The column structure can be defined in the following ways:
  * Dictionary where the keys are column names and the values are strings of the Spark SQL type name.
    * Ex: `{'cola': 'string', 'colb': 'int'}`
@@ -33,12 +33,16 @@ import sqlglot
from sqlglot.dataframe.sql.session import SparkSession
from sqlglot.dataframe.sql import functions as F
-sqlglot.schema.add_table('employee', {
- 'employee_id': 'INT',
- 'fname': 'STRING',
- 'lname': 'STRING',
- 'age': 'INT',
-}) # Register the table structure prior to reading from the table
+sqlglot.schema.add_table(
+ 'employee',
+ {
+ 'employee_id': 'INT',
+ 'fname': 'STRING',
+ 'lname': 'STRING',
+ 'age': 'INT',
+ },
+ dialect="spark",
+) # Register the table structure prior to reading from the table
spark = SparkSession()