Oracle Date column cleaning

前端 未结 3 1692
遥遥无期
遥遥无期 2020-12-12 05:07

I have a table1(DueDate varchar2(20)). It has thousands of date data in different format with some bad data like characters.

eg.

YYYYMMDD,
MM/DD/YYY         


        
相关标签:
3条回答
  • 2020-12-12 05:44

    This is one of the reasons that storing date information in a character field is such a bad idea.

    The easiest option is to create a function that attempts to convert the string to a date using the formats in whatever priority order you have (i.e. is 010203 Jan 2, 2003 or Feb 3, 2001 or something else) and catches the exceptions. Something like

    CREATE OR REPLACE FUNCTION my_to_date( p_str IN VARCHAR2 )
      RETURN DATE
    IS
      l_date DATE;
    BEGIN
      l_date := to_date( p_str, 'YYYYMMDD' );
      RETURN l_date;
    EXCEPTION
      WHEN others THEN
        BEGIN
          l_date := to_date( p_str, 'MM/DD/YYYY' );
          RETURN l_date;
        EXCEPTION 
          WHEN others
          THEN
            RETURN null;
        END;
    END;
    

    which works something like

    SQL> CREATE OR REPLACE FUNCTION my_to_date( p_str IN VARCHAR2 )
      2    RETURN DATE
      3  IS
      4    l_date DATE;
      5  BEGIN
      6    l_date := to_date( p_str, 'YYYYMMDD' );
      7    RETURN l_date;
      8  EXCEPTION
      9    WHEN others THEN
     10      BEGIN
     11        l_date := to_date( p_str, 'MM/DD/YYYY' );
     12        RETURN l_date;
     13      EXCEPTION
     14        WHEN others
     15        THEN
     16          RETURN null;
     17      END;
     18  END;
     19  /
    
    Function created.
    
    SQL> select my_to_date( '19000101' ) from dual;
    
    MY_TO_DAT
    ---------
    01-JAN-00
    
    SQL> ed
    Wrote file afiedt.buf
    
      1* select my_to_date( '01/02/2005' ) from dual
    SQL> /
    
    MY_TO_DAT
    ---------
    02-JAN-05
    

    Of course, you'd have to code the full set of valid date formats in your code, I'm just handling the first two in your list.

    0 讨论(0)
  • 2020-12-12 05:50

    This answer builds on Justin Cave’s. If you create another column that is a date field you can then do something like the following:

    UPDATE table1 SET DueDate2 = my_to_date(DueDate) 
       WHERE DueDate2 IS NULL AND DueDate IS NOT NULL;
    

    You could then re-run this update after each modification of the date format in the function. You would have to stop when all the dates not converted are invalid dates (such as ‘ABCD’).

    0 讨论(0)
  • 2020-12-12 05:57

    Pure SQL solution for Oracle 12.2, where the handy default null on conversion error option avoids the need to write functions and handle exceptions:

    with demo (date_string) as
         ( select '2018081' from dual union all
           select '01/8/2018' from dual union all
           select '01-8-2018' from dual union all
           select 'ABCD' from dual union all
           select '2018-08-01' from dual union all
           select '2018-01-8' from dual union all
           select '2017081' from dual union all
           select '01-AUG-2018' from dual
         )
    select t.date_string
         , coalesce
           ( to_date(t.date_string default null on conversion error)
           , case sys_context('USERENV','NLS_TERRITORY')
                  when 'AMERICA' then to_date(t.date_string default null on conversion error,'MM/DD/YYYY')
                  else to_date(t.date_string default null on conversion error,'DD/MM/YYYY')
             end
           , to_date(t.date_string default null on conversion error,'DD/MM/YYYY')
           , to_date(t.date_string default null on conversion error,'YYYY/MM/DD')
           , to_date(t.date_string default null on conversion error,'YYYYMMDD')
           , to_date(t.date_string default null on conversion error,'DD/MON/YYYY')
           ) as converted_date
    from   demo t
    

    Results:

    DATE_STRING CONVERTED_DATE
    ----------- --------------
    2018081     01-AUG-2018
    01/8/2018   08-JAN-2018
    01-8-2018   08-JAN-2018
    ABCD
    2018-08-01  01-AUG-2018
    2018-01-8   08-JAN-2018
    2017081     01-AUG-2017
    01-AUG-2018 01-AUG-2018
    

    The first test does not specify any format in order to use the session default (DD-MON-RRRR etc).

    The next test attempts to handle the North American habit of placing the month first. (I'm not sure if sys_context('USERENV','NLS_TERRITORY') = 'AMERICA' is a reliable test though, as that tends to be the default setting regardless of actual territory. However it illustrates how you can inject some conditional logic if desired.)

    to_date is pretty forgiving about separator characters so I haven't specified different formats for DD-MON-YYYY vs DD/MON/YYYY etc.

    0 讨论(0)
提交回复
热议问题