Using Mac’s Dictation Inside Python

戏子无情 提交于 2019-12-04 10:58:22

First, FnFn dictation is a feature of the NSText (or maybe NSTextView?) Cocoa control. If you've got one of those, the dictated text gets inserted into that control. (It also uses that control's existing text for context.) From the point of view of the app using an NSTextView, if you just create a standard Edit menu, the Start Dictation item gets added to the end, with FnFn as a shortcut, and anything that gets dictated appears as input, just like input typed on a keyboard, or pasted or dragged with the mouse, or via any other input method.

So, if you don't have a GUI app, enabling dictation is going to be pointless, because you have no way to get the input.

If you do have a GUI app, the simplest thing to do is just get the menu item via NSMenu, and click the item.

You're almost certainly using some kind of GUI library, like PyQt or Tkinter, which has its own way of accessing your app's menu. But if not, you can do it directly through Cocoa (using PyObjC—which comes with Apple's pre-installed Python, but which you'll have to pip install if you're using a third-party Python):

import AppKit
mb = AppKit.NSApp.mainMenu()
edit = mb.itemWithTitle_('Edit').submenu()
sd = edit.indexOfItemWithTitle_('Start Dictation')
edit.performActionForItemAtIndex_(sd)

But if you're writing a console program that runs in the terminal (whether Terminal.app or an alternative like iTerm), the app you're running under has its own text widget and Edit menu, and you can parasitically use its menu instead.

The problem is that you don't have permission to just control other apps unless the user allows it. In older versions of OS X, this was done just by turning on "assistive scripting for accessibility" globally. As of 10.10, there's an Accessibility anchor in the Privacy tab of the Security & Privacy pane of System Preferences that has a list of apps that have permissions. Fortunately, if you're not on the list, the first time you try to use accessibility features, it'll pop up a dialog, and if the user clicks on it, it'll launch System Preferences, reveal that anchor, add your app to the list with the checkbox disabled, and scroll it into view, so all the user has to do is click the checkbox.

The AppleScript to do this is:

tell application "System Events"
    click (menu item "Start Dictation" of menu of menu bar item "Edit" 
        of menu bar of (first process whose frontmost is true))
end tell

The "right" way to do the equivalent in Python is via ScriptingBridge, which you can access via PyObjC… but it's a lot easier to use the third-party library appscript:

from appscript import app, its
se = app('System Events')
proc = app.processes[its.frontmost == True]
mi = proc.menu_bars[1].menu_bar_items['Edit'].menus[1].menu_items['Start Dictation']
mi.click()

If you really want to send the Fn key twice, the APIs for generating and sending keyboard events are part of Quartz Events Services, which (even though it's a CoreFoundation C API, not a Cocoa ObjC API) is also wrapped by PyObjC. The documentation can be a bit tricky to understand, but basically, the idea is that you create an event of the appropriate type, then either post it to a specific application, an event tap, or a tap location. So, you can create and send a system-wide key-down Fn-key event like this:

evt = Quartz.CGEventCreateKeyboardEvent(None, 63, True)
Quartz.CGEventPost(Quartz.kCGSessionEventTap, evt)

To send a key-up event, just change that True to False.

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!