It’s easy to create data sets using a text editor or a spreadsheet utility. With the latest version of these utilities, you can even make them up to a million rows. Once the data is created, it needs to be loaded into a database table or tables so that is can be utilized. Oracle has a couple of utilities for loading data from flat file – SQL Loader and external tables. SQL Loader can be a bit complicated to work with, so I prefer external tables for quickly loading data from flat files. Here’s a quick method I recently used.
I created a .csv file with 1048576 rows and 25 columns. Here is the first comma-delimited record. You can copy and paste it into a spreadsheet or a text file over and over, creating as any records as you like:
123456,This is a test,3.14,14-Dec-14,http://www.msn.com/en-us/autos/classic-cars/the-derelict-1952-chrysler-town-and-country-beauty-in-the-beast/ar-BBt2Y1e,123456,This is a test,3.14,14-Dec-14,http://www.msn.com/en-us/autos/classic-cars/the-derelict-1952-chrysler-town-and-country-beauty-in-the-beast/ar-BBt2Y1e,123456,This is a test,3.14,14-Dec-14,http://www.msn.com/en-us/autos/classic-cars/the-derelict-1952-chrysler-town-and-country-beauty-in-the-beast/ar-BBt2Y1e,123456,This is a test,3.14,14-Dec-14,http://www.msn.com/en-us/autos/classic-cars/the-derelict-1952-chrysler-town-and-country-beauty-in-the-beast/ar-BBt2Y1e
This is the table I created to hold the data in the Oracle 11.2.0.1 database (please excuse the lack of creative column names):
create table jontest.jontab( ID1 number, LABEL1 varchar2(15), PI1 number, DATE1 date, URL1 varchar2(500), ID2 number, LABEL2 varchar2(15), PI2 number, DATE2 date, URL2 varchar2(500), ID3 number, LABEL3 varchar2(15), PI3 number, DATE3 date, URL3 varchar2(500), ID4 number, LABEL4 varchar2(15), PI4 number, DATE4 date, URL4 varchar2(500), ID5 number, LABEL5 varchar2(15), PI5 number, DATE5 date, URL5 varchar2(500) );
Now that the target has been created, we need to set up the system for the external table process. We need to create a directory object in the database using a directory where the oracle software account has read and write access:
create directory impdir as ‘/home/oracle’;
Make sure the table owner has read and write permission on this directory object:
grant read, write on directory impdir to jontest;
Place the input .csv file (called ironically input.csv) in the directory object location and ensure the ownership and permissions on the file are appropriate.
Next, create the external table to access the input file. An external table is not actually a physical table in the database, but more like a view that allows access to the data in the file. Note that this table was created as the ‘jontest’ user.
CREATE TABLE extload ( ID1 char(6), LABEL1 char(15), PI1 char(4), DATE1 date, URL1 char(500), ID2 char(6), LABEL2 char(15), PI2 char(4), DATE2 date, URL2 char(500), ID3 char(6), LABEL3 char(15), PI3 char(4), DATE3 date, URL3 char(500), ID4 char(6), LABEL4 char(15), PI4 char(4), DATE4 date, URL4 char(500), ID5 char(6), LABEL5 char(15), PI5 char(4), DATE5 date, URL5 char(500) ) ORGANIZATION EXTERNAL ( TYPE ORACLE_LOADER DEFAULT DIRECTORY impdir ACCESS PARAMETERS ( RECORDS DELIMITED BY NEWLINE FIELDS TERMINATED BY ‘,’ MISSING FIELD VALUES ARE NULL ( ID1 char(6), LABEL1 char(15), PI1 char(4), DATE1 char(10) date_format DATE mask “DD-MON-YY”, URL1 char(500), ID2 char(6), LABEL2 char(15), PI2 char(4), DATE2 char(10) date_format DATE mask “DD-MON-YY”, URL2 char(500), ID3 char(6), LABEL3 char(15), PI3 char(4), DATE3 char(10) date_format DATE mask “DD-MON-YY”, URL3 char(500), ID4 char(6), LABEL4 char(15), PI4 char(4), DATE4 char(10) date_format DATE mask “DD-MON-YY”, URL4 char(500), ID5 char(6), LABEL5 char(15), PI5 char(4), DATE5 char(10) date_format DATE mask “DD-MON-YY”, URL5 char(500) ) ) LOCATION (‘input.csv’) ) reject limit unlimited;
I used the same column names as my target table, again lacking any creativity in this area. Notice that the column data types are either ‘char’ or ‘date’. This just makes thing easier and the data can be transformed during the loading into the target table, as I will demonstrate. You can already see in the second listing of the columns in this table that I am already formatting the date columns.
If the creation is successful, you should be able to perform a count on the number of records in the table, or select one of the column values. If the count or select comes back with no records or errors, check the directory object location for .bad or .log files with the table name as the prefix of the file name. These files are created by default even though we did not specify them in the external table creation statement.
Once you can select from the external table, you can treat it like any other read-only table. We can now perform any data conversion while we load our target table.
insert into jontest.jontab select to_number(ID1,999999), LABEL1, to_number(PI1,9.99), to_date(DATE1,’DD-MON-YY’), URL1, to_number(ID2,999999), LABEL2, to_number(PI2,9.99), to_date(DATE2,’DD-MON-YY’), URL2, to_number(ID3,999999), LABEL3, to_number(PI3,9.99), to_date(DATE3,’DD-MON-YY’), URL3, to_number(ID4,999999), LABEL4, to_number(PI4,9.99), to_date(DATE4,’DD-MON-YY’), URL4, to_number(ID5,999999), LABEL5, to_number(PI5,9.99), to_date(DATE5,’DD-MON-YY’), URL5 from jontest.extload;