Question
I've been playing around with some old code and came across a function I wrote a while ago that counts how many times each letter of the alphabet appears in a given string. My initial version looped through the string 26 times, once per letter. I knew that was really inefficient, so I tried this instead:
int *frequency_table(char *string) {
    int i;
    char c;
    int *freqCount = NULL;
    freqCount = mallocPtr(freqCount, 26, sizeof(int), "freqCount"); /* mallocs and checks for out of memory */
    for (i = 0; string[i] != '\0'; i++) {
        c = string[i];
        if (isalpha(c)) {
            isupper(c) ? freqCount[c - 65]++ : freqCount[c - 97]++;
        }
    }
    return (freqCount);
}
The code above loops through a string and checks each character. If the character is an alphabetic letter (a-z or A-Z), I increment the frequency count at the corresponding index in the freqCount array (where index 0 = a/A, 1 = b/B, ... , 25 = z/Z).
The code seems to be counting fine, but when I print the array, I get the following output:
String: "abcdefghijklmnopqrstuvwxyziii"
a/A -1276558703
b/B 32754
c/C -1276558703
d/D 32754
e/E 862570673
f/F 21987
g/G 862570673
h/H 21987
i/I 4
j/J 1
k/K 1
l/L 1
m/M 1
n/N 1
o/O 1
p/P 1
q/Q 1
r/R 1
s/S 1
t/T 1
u/U 1
v/V 1
w/W 1
x/X 1
y/Y 1
z/Z 1
For reference, I'm printing the array in the following manner:
for (i = 0; i < 26; i++) {
    printf("%c/%c %d\n", i + 97, i + 65, freqCount[i]);
}
I checked that the pointer was allocated properly, and I know for sure I didn't overwrite this memory location. Maybe I'm missing something, but I really can't figure out why it's printing garbage values for a/A through h/H.
Also, if there is a more efficient way to do what I'm trying to do, I'd love to hear it.
Thanks
Answer 1:
- As many have mentioned, you have to initialize the counts to 0.
- You can also use the trick below to speed up letter counting: if the character is a letter, clear the bit with value 32, which is the only bit that differs between an uppercase ASCII letter and its lowercase form. Subtracting 'A' from the result gives the correct index.
- Lastly, you can use a short array unless you expect a LOT of letters (a short is only guaranteed to count up to 32767).
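#include <ctype.h>   /* isalpha */
#include <stdio.h>
#include <stdlib.h>
short *frequency_table(char *string){
    char c;
    short *freqCount;
    /* calloc zero-initializes the 26 counters */
    if (!(freqCount = (short*)calloc(26, sizeof(short))))
        return NULL;
    for(int i = 0; (c = string[i]) != '\0'; i++) {
        /* cast: isalpha is undefined for negative char values */
        if(isalpha((unsigned char)c))
            freqCount[(c & ~32) - 'A']++;
    }
    return(freqCount);
}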
Main Test:
int main() {
    short *n = frequency_table("helloiHEllo6456gdrgd#%#^#$^#_thirde");
    if (n == NULL)
        return 1;
    for (char c = 'a'; c <= 'z'; c++)
        printf("%c: %d\n", c, n[c - 'a']);
    free(n);
    return 0;
}
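The case-folding trick used above can be checked on its own. Here is a minimal sketch, assuming ASCII and the default "C" locale; letter_index is a hypothetical helper name and is not part of the answer's code.

```c
#include <ctype.h>

/* Hypothetical helper: returns 0..25 for an ASCII letter, -1 otherwise.
 * Assumes ASCII, where 'a' - 'A' == 32, so clearing that bit folds a
 * lowercase letter onto its uppercase form. */
int letter_index(char ch)
{
    unsigned char c = (unsigned char)ch;

    /* The isalpha() guard matters: clearing bit 32 on a non-letter such
     * as '{' (0x7B) would give '[' (0x5B), which is not a letter. */
    if (!isalpha(c))
        return -1;
    return (c & ~32) - 'A';
}
```

With this helper, letter_index('h') and letter_index('H') both give 7, while '{' is rejected rather than being folded onto '['.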
Answer 2:
There are 2 problems in your code:
- the array freqCount is uninitialized.
- you should avoid passing char values to isalpha because it would cause undefined behavior if string contains negative char values on systems where char is signed by default.
Instead of a ternary operator or an if statement, you can use toupper() to convert lowercase characters to uppercase, and it is more readable to write 'A' or 'a' instead of their hard coded ASCII values 65 and 97.
Here is a corrected version:
int *frequency_table(const char *string) {
    size_t i;
    /* allocate the array with malloc and check for out of memory */
    int *freqCount = NULL;
    freqCount = mallocPtr(freqCount, 26, sizeof(int), "freqCount");
    for (i = 0; i < 26; i++) {
        freqCount[i] = 0;
    }
    for (i = 0; string[i] != '\0'; i++) {
        unsigned char c = string[i];
        if (isalpha(c)) {
            /* this code assumes ASCII, so 'Z'-'A' == 25 */
            freqCount[toupper(c) - 'A']++;
        }
    }
    return freqCount;
}
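Since mallocPtr is the asker's own helper and its definition is not shown, the version above cannot be compiled as is. Here is a self-contained sketch of the same idea, assuming calloc() is an acceptable substitute; the name frequency_table_calloc is made up for illustration.

```c
#include <ctype.h>
#include <stdlib.h>

/* Sketch: same counting loop as above, with calloc() standing in for the
 * asker's mallocPtr() helper (an assumption, since that helper is not shown).
 * Returns NULL on allocation failure; the caller must free() the result. */
int *frequency_table_calloc(const char *string)
{
    /* calloc() zero-fills the block, so no separate clearing loop is needed */
    int *freqCount = calloc(26, sizeof *freqCount);
    if (freqCount == NULL)
        return NULL;

    for (size_t i = 0; string[i] != '\0'; i++) {
        unsigned char c = (unsigned char)string[i];
        if (isalpha(c)) {
            /* assumes ASCII and the default "C" locale */
            freqCount[toupper(c) - 'A']++;
        }
    }
    return freqCount;
}
```

Called on the string from the question, this should report 4 for i/I and 1 for every other letter, with no garbage in the first entries.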
Answer 3:
the following proposed code:
- avoids malloc(), calloc(), etc
- keeps the definition of data, etc inside the main() function
- performs the desired functionality
- cleanly compiles
- uses simple character literals rather than 'magic' numbers
- is expecting the ASCII character set
and now, the proposed code:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#define MAX_ALPHA 26
void charCounter( char *, int * );
int main( void )
{
    char string[] = "abcdefghijklmnopqrstuvwxyziii";
    int freqCount[ MAX_ALPHA ] = {0};
    charCounter( string, freqCount );
    for( size_t i = 0; i < MAX_ALPHA; i++)
    {
        printf("%c/%c %d\n", (char)(i + 'A'), (char)(i + 'a'), freqCount[i]);
    }
}
void charCounter( char *string, int freqCount[] )
{
    for( size_t i=0; string[i]; i++ )
    {
        /* cast: the <ctype.h> functions are undefined for negative char values */
        unsigned char c = (unsigned char)string[i];
        if( isalpha( c ) )
        {
            freqCount[ toupper(c) - 'A' ]++;
        }
    }
}
a run of the code results in:
A/a 1
B/b 1
C/c 1
D/d 1
E/e 1
F/f 1
G/g 1
H/h 1
I/i 4
J/j 1
K/k 1
L/l 1
M/m 1
N/n 1
O/o 1
P/p 1
Q/q 1
R/r 1
S/s 1
T/t 1
U/u 1
V/v 1
W/w 1
X/x 1
Y/y 1
Z/z 1
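If the ASCII assumption is a concern, the index can be looked up instead of computed. The following is only a sketch of that alternative: charCounterPortable and the alphabet array are hypothetical names, and the linear strchr() lookup trades some speed for not relying on the letters being contiguous.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* One array object, so the pointer subtraction below is well defined. */
static const char alphabet[] = "abcdefghijklmnopqrstuvwxyz";

/* Hypothetical variant of charCounter(): finds each letter's position in
 * `alphabet` rather than subtracting 'A'. freqCount must point to 26
 * zero-initialized elements. */
void charCounterPortable(const char *string, int freqCount[])
{
    for (size_t i = 0; string[i] != '\0'; i++) {
        unsigned char c = (unsigned char)string[i];
        if (!isalpha(c))
            continue;
        /* strchr() returns NULL for letters outside a-z (possible in
         * non-"C" locales); those are simply skipped. */
        const char *p = strchr(alphabet, tolower(c));
        if (p != NULL)
            freqCount[p - alphabet]++;
    }
}
```

On ASCII systems this gives the same counts as charCounter(); the subtraction-based version is simpler and faster there, so this variant only makes sense where the character set is in doubt.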
Source: https://stackoverflow.com/questions/61148630/calculate-the-number-of-times-each-letter-appears-in-a-string