Since cores show up as separate CPUs to the OS, you can use the same code you'd use to determine the load per CPU on a multiprocessor machine. One such example (in C) is here. Note that it uses WMI, so the other thread linked in the comments above probably gets you most of the way there.
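To illustrate why no core-specific logic is needed, here is a minimal sketch in C of the arithmetic only. It assumes you already have two snapshots of cumulative idle and total tick counts for each logical CPU, from WMI or any other source. The `cpu_sample` struct and both function names are hypothetical, made up for this example, and are not part of WMI or of the linked code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical snapshot of one logical CPU's cumulative counters.
   Fill it from whatever source you query (WMI, for instance). */
typedef struct {
    uint64_t idle;   /* ticks spent idle so far */
    uint64_t total;  /* all ticks so far (idle + busy) */
} cpu_sample;

/* Load in percent for one logical CPU between two snapshots. */
static double cpu_load_percent(cpu_sample prev, cpu_sample cur)
{
    uint64_t total_delta = cur.total - prev.total;
    uint64_t idle_delta  = cur.idle - prev.idle;

    /* No time elapsed, or inconsistent counters: report 0 rather than divide. */
    if (total_delta == 0 || idle_delta > total_delta)
        return 0.0;

    return 100.0 * (double)(total_delta - idle_delta) / (double)total_delta;
}

/* Same loop whether the n entries are cores or physical CPUs:
   the OS lists each one as a separate logical CPU. */
static void per_cpu_load(const cpu_sample *prev, const cpu_sample *cur,
                         size_t n, double *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = cpu_load_percent(prev[i], cur[i]);
}
```

You would call `per_cpu_load` with one array entry per logical CPU the OS reports, taking the two snapshots some interval apart (a second or so is a common choice).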