Doesn’t matter who much testing you do (well, it actually does but that’s a whole different issue!) you can guarantee that at some point your beautiful code that parses data input from a web form or loads data from some external file will pop up with the error:
SQL Error: ORA-01722: invalid number
01722. 00000 - "invalid number"
*Cause: The specified number was invalid.
*Action: Specify a valid number.
Of course, what’s is really annoying at this point is that you don’t know which column value of the record failed (assuming that you have more than one numeric column)
For this tutorial let’s build some simple staging and target tables in our schema:
CREATE TABLE STAGING_EMP
( "EMPNO" VARCHAR2(6),
"ENAME" VARCHAR2(10),
"JOB" VARCHAR2(9),
"MGR" VARCHAR2(4),
"HIREDATE" VARCHAR2(10),
"SAL" VARCHAR2(7),
"COMM" VARCHAR2(9),
"DEPTNO" VARCHAR2(6));
Now let's insert some data into the staging table
-- INSERTING DATA INTO STAGING_EMP
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('GB9369','SMITH','CLERK','7902','17-DEC-80','800',null,'20');
-- INVALID DATE
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('9499','ALLEN','SALESMAN','7698','31-FEB-81','1600','300','30');
-- INVALID NUMBER FOR DEPTNO
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('9521','WARD','SALESMAN','7698','22-FEB-81','1250','500','SALES');
-- INVALID NUMBER FOR EMPNO KEY
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('US9566','JONES','MANAGER','7839','02-APR-81','2975',null,'20');
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('9782','CLARK','MANAGER','7839','09-JUN-81','2450',null,'10');
-- INVALID NUMBER FOR EMPNO KEY
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('FR9788','SCOTT','ANALYST','7566','19-APR-87','3000',null,'20');
-- INVALID NUMBER FOR MGR KEY
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('9839','KING','PRESIDENT',null,'17-NOV-81','5000',null,'10');
-- INVALID NUMBER FOR EMPNO KEY
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('DE9844','TURNER','SALESMAN','7698','08-SEP-81','1500',0,'30');
Insert into STAGING_EMP (EMPNO,ENAME,JOB,MGR,HIREDATE,SAL,COMM,DEPTNO) values ('9876','ADAMS','CLERK','7788','23-MAY-87','1100',null,'20');
Now we have a data set that we can try to load into our target table
Now let’s try inserting the data from our staging table into the EMP table and see what happens:
INSERT INTO emp SELECT * FROM staging_emp;
… and not surprisingly I get the following error:
SQL Error: ORA-01722: invalid number
because the data in the staging table contains values that fail the conversion. But which values (rows/columns) caused the error?
We can deal with this situation in a couple of different ways. Firstly let’s try and discover which rows and columns in my staging table contain values that are likely to cause data conversion errors.
To do this I am going to use the new VALIDATE_CONVERSION() function which identifies problem data that cannot be converted to the required data type. It returns 1 if a given expression can be converted to the specified data type, else it returns 0.
SELECT
VALIDATE_CONVERSION(empno AS NUMBER) AS is_empno,
VALIDATE_CONVERSION(mgr AS NUMBER) AS is_mgr,
VALIDATE_CONVERSION(hiredate AS DATE) AS is_hiredate,
VALIDATE_CONVERSION(sal AS NUMBER) AS is_sal,
VALIDATE_CONVERSION(comm AS NUMBER) AS is_comm,
VALIDATE_CONVERSION(deptno AS NUMBER) AS is_deptno
FROM staging_emp;
the output from the above query enables us to easily pick out the rows where the data conversion is going to succeed (column value is 1) and fail (column value is 0).
We can use this information to filter the data in my staging table as we insert it into the EMP table or we can use the enhanced CAST and TO_xxx functions within the INSERT INTO ….. SELECT statements.
We can use this information to filter the data in our staging table as I insert it into the EMP table or we could use the enhanced CAST and TO_xxx functions within the INSERT INTO ….. SELECT statements.
The CAST function (along with TO_NUMBER, TO_BINARY_FLOAT, TO_BINARY_DOUBLE, TO_DATE, TO_TIMESTAMP, TO_TIMESTAMP_TZ, TO_DSINTERVAL, and TO_YMINTERVAL functions) can now return a user-specified value, instead of an error, when data type conversion errors occur. This reduces failures during an data transformation and data loading processes.
Therefore, the new 12.2 self-validating SELECT statement looks like this:
INSERT INTO emp
SELECT
empno,
ename,
job,
CAST(mgr AS NUMBER DEFAULT 9999 ON CONVERSION ERROR),
CAST(hiredate AS DATE DEFAULT sysdate ON CONVERSION ERROR),
CAST(sal AS NUMBER DEFAULT 0 ON CONVERSION ERROR),
CAST(comm AS NUMBER DEFAULT null ON CONVERSION ERROR),
CAST(deptno AS NUMBER DEFAULT 99 ON CONVERSION ERROR)
FROM staging_emp
WHERE VALIDATE_CONVERSION(empno AS NUMBER) = 1;
We now have five rows added to our EMP table. Here is the data that was loaded:
SELECT * FROM emp;
We can see that on row 1 the HIERDATE was invalid so it was replaced by the value from sys date (07-JUL-16). Row 2 the value of DEPTNO is the conversion default of 99 and on row 4 the value for MGR is the conversion default of 9999.
The fact that we only loaded 5 rows obviously this means that 4 rows were rejected during the insert process. We look at the rows in the staging table that failed to be inserted by using the VALIDATE_CONVERSION function:
SELECT * FROM staging_emp
WHERE VALIDATE_CONVERSION(empno AS NUMBER) = 0;
We can see the rows that were rejected because they contain errors converting the value in the empno column to a number to create the empno key.
The enhanced CAST function (along with TO_NUMBER, TO_BINARY_FLOAT, TO_BINARY_DOUBLE, TO_DATE, TO_TIMESTAMP, TO_TIMESTAMP_TZ, TO_DSINTERVAL, and TO_YMINTERVAL functions) can help you deal with data conversion errors without having to resort to complicated PL/SQL code or writing data validation routines within your application code.
The new VALIDATE_CONVERSION() function can be used to help you identify column values that cannot be converted to the required data type.
Hope these two features are useful. Enjoy!