The situation is as follows:
I have a substantial number of tables, each with a substantial number of columns. I need to deal with this old and to-be-deprecated data
You can take advantage of how the COUNT aggregate function treats NULLs. When you pass a column as the argument, COUNT returns the number of non-NULL values in that column, while COUNT(*) returns the total number of rows. Dividing the first by the second gives the fraction of rows in which the column actually holds a value; a ratio close to 0 means the column is almost entirely NULL.
I will give an example with the following table structure:
CREATE TABLE `t1` (
`id` int(10) unsigned NOT NULL AUTO_INCREMENT,
`col_1` int(10) unsigned DEFAULT NULL,
`col_2` int(10) unsigned DEFAULT NULL,
PRIMARY KEY (`id`)
) ;
-- let's fill the table with random values
INSERT INTO t1(col_1,col_2) VALUES(1,2);
INSERT INTO t1(col_1,col_2)
SELECT
IF(RAND() > 0.5, NULL, FLOOR(RAND()*1000)),
IF(RAND() > 0.5, NULL, FLOOR(RAND()*1000)) FROM t1;
-- run the last INSERT-SELECT statement a few times
SELECT COUNT(col_1)/COUNT(*) AS col_1_ratio,
COUNT(col_2)/COUNT(*) AS col_2_ratio FROM t1;
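The ratios above are the share of non-NULL values per column. If the NULL count or the NULL share is more convenient, the same two aggregates can be rearranged. A minimal sketch against the t1 table above; the alias names and the NULLIF guard for an empty table are my own additions, not part of the original query:

```sql
-- COUNT(*) counts every row, COUNT(col) skips NULLs,
-- so their difference is the number of NULLs in col.
SELECT COUNT(*) - COUNT(col_1)                 AS col_1_nulls,
       1 - COUNT(col_1) / NULLIF(COUNT(*), 0)  AS col_1_null_ratio,
       COUNT(*) - COUNT(col_2)                 AS col_2_nulls,
       1 - COUNT(col_2) / NULLIF(COUNT(*), 0)  AS col_2_null_ratio
FROM t1;
-- NULLIF(COUNT(*), 0) turns a zero row count into NULL,
-- so an empty table yields NULL ratios instead of dividing by zero.
```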
You can write a stored procedure that takes the table name as an input parameter and builds this query automatically from the INFORMATION_SCHEMA database. (MySQL does not allow prepared statements inside stored functions, so it has to be a procedure.) Here's how to build the column list directly from INFORMATION_SCHEMA.COLUMNS:
-- GROUP_CONCAT output is truncated at group_concat_max_len (1024 by default),
-- so raise it for tables with many columns
SET SESSION group_concat_max_len = 1000000;
SELECT GROUP_CONCAT(
         CONCAT('COUNT(`', c.COLUMN_NAME, '`)/COUNT(*) AS `', c.COLUMN_NAME, '_ratio`')
         ORDER BY c.ORDINAL_POSITION)
  INTO @column_list
FROM INFORMATION_SCHEMA.COLUMNS c
WHERE c.COLUMN_KEY <> 'PRI' AND c.TABLE_SCHEMA = DATABASE()
  AND c.TABLE_NAME = 't1';
SET @null_counters_sql := CONCAT('SELECT ',@column_list, ' FROM t1');
PREPARE NULL_COUNTERS FROM @null_counters_sql;
EXECUTE NULL_COUNTERS;
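To run this against many tables, the steps above can be wrapped in a stored procedure. The following is only a sketch: the procedure name null_ratios and the parameter p_table are hypothetical names I chose for illustration, and DELIMITER is a command of the mysql command-line client, so other clients may need a different way to define the routine.

```sql
DELIMITER //
CREATE PROCEDURE null_ratios(IN p_table VARCHAR(64))  -- hypothetical name
BEGIN
  -- Avoid truncating the generated column list on wide tables.
  SET SESSION group_concat_max_len = 1000000;

  -- One COUNT(col)/COUNT(*) expression per non-primary-key column.
  SELECT GROUP_CONCAT(
           CONCAT('COUNT(`', COLUMN_NAME, '`)/COUNT(*) AS `',
                  COLUMN_NAME, '_ratio`')
           ORDER BY ORDINAL_POSITION)
    INTO @column_list
    FROM INFORMATION_SCHEMA.COLUMNS
   WHERE TABLE_SCHEMA = DATABASE()
     AND TABLE_NAME = p_table
     AND COLUMN_KEY <> 'PRI';

  -- If no columns matched, @column_list is NULL and PREPARE will fail.
  SET @null_counters_sql :=
      CONCAT('SELECT ', @column_list, ' FROM `', p_table, '`');

  PREPARE null_counters FROM @null_counters_sql;
  EXECUTE null_counters;
  DEALLOCATE PREPARE null_counters;
END //
DELIMITER ;

CALL null_ratios('t1');
```

Inside a stored routine, DATABASE() refers to the schema the routine belongs to, so the procedure should be created in the same schema as the tables it inspects. The table name is concatenated into dynamic SQL, so only pass names you trust.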