I'm trying to get a grasp on regular expressions and I came across the one used in the str.extract method:
movies['year']=movies
Try using this:
movies['year'] = movies['title'].str.extract(r'.*\((\d{4})\).*', expand=False)
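To see what that pattern pulls out, here is a minimal sketch. The sample titles are made up, since the actual `movies` frame is not shown in the question. It also assumes the surrounding `.*` can be dropped, because `.str.extract` searches for the pattern anywhere in the string rather than requiring a full match.

```python
import pandas as pd

# Hypothetical sample data; the real `movies` frame is not shown in the question.
movies = pd.DataFrame(
    {"title": ["Toy Story (1995)", "Heat (1995)", "Untitled Project"]}
)

# The raw string keeps the backslashes intact for the regex engine.
# \( and \) match literal parentheses; (\d{4}) captures exactly four digits.
movies["year"] = movies["title"].str.extract(r"\((\d{4})\)", expand=False)

# Titles without a parenthesized four-digit number get NaN.
print(movies)
```

One difference worth knowing: the leading `.*` in the original pattern is greedy, so with it the last parenthesized four-digit number in a title is captured, while without it the first one is.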
First of all, the behavior of pandas .str.extract() is expected: it returns only the contents of the capturing groups. The pattern passed to extract requires at least one capturing group:
pat : string
Regular expression pattern with capturing groups
If you use a named capturing group, the new column will be named after the named group.
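A small sketch of the named-group variant, again with made-up titles. The group name `year` is only an illustration.

```python
import pandas as pd

# Hypothetical sample titles.
titles = pd.Series(["Toy Story (1995)", "Heat (1995)"])

# (?P<year>...) is a named capturing group. With the default expand=True,
# the result is a DataFrame whose column takes the group's name.
result = titles.str.extract(r"\((?P<year>\d{4})\)")

print(result.columns.tolist())
print(result)
```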
The grep command you provided can be reduced to
grep '\((.*)\)'
as grep is capable of matching a line partially (it does not require a full line match) and works on a per-line basis: once a match is found, the whole line is returned. To override that behavior, you may use the -o switch, which prints only the matched part of the line.
With grep, you cannot return only the capturing group contents. This can be worked around with a PCRE pattern enabled by the -P option of GNU grep, but that option is not available on Mac, for example. sed or awk may help in those situations, too.
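For comparison, the three results discussed here (the whole line, only the matched part, only the group contents) can be imitated with Python's `re` module. This is a rough analogy on a made-up input line, not a description of how grep works internally. Note that the escaping is the reverse of grep's basic syntax: in Python, `\(` is a literal parenthesis and a bare `(` opens a group.

```python
import re

line = "Toy Story (1995) - animation"  # hypothetical input line

# re.search accepts a match anywhere in the string, much like grep does.
m = re.search(r"\((.*)\)", line)

whole_line = line if m else None           # roughly what plain grep prints
matched_part = m.group(0) if m else None   # roughly what grep -o prints
group_only = m.group(1) if m else None     # what .str.extract returns

print(whole_line)
print(matched_part)
print(group_only)
```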