cay-zhang / swiftspeech
A speech recognition framework designed for SwiftUI.
License: MIT License
I assign the speech recognition result to a @State var speechRecogText and check whether another hardcoded string, private var textFieldText, contains the string from the speech recognition. This works properly and prints "contains..." in English, but with Arabic it does not work.
@State var speechRecogText: String = ""
private var textFieldText: String = "قل"
if textFieldText.contains(speechRecogText) {
print("contains voice text")
} else {
print("doesn't contain voice text")
}
Console
// doesn't contain voice text
However when I try to swap the variables like this:
if speechRecogText.contains(textFieldText) {
print("contains text")
} else {
print("doesnt contain text")
}
Console
// contains text
What might be the reason for this? Does it have anything to do with the language or how Strings actually behave?
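One possible explanation (an assumption, not confirmed in this issue): contains(_:) does an exact Unicode substring match, so leading/trailing whitespace or diacritic differences in the recognized Arabic text will make the check fail even when the words look identical. A minimal sketch of a normalized comparison, assuming that is the cause:

import Foundation

// Hedged sketch: trim whitespace and fold case/diacritics/width before comparing,
// so small differences in the recognized Arabic string don't defeat contains(_:).
func containsNormalized(_ haystack: String, _ needle: String) -> Bool {
    let fold: (String) -> String = { s in
        s.trimmingCharacters(in: .whitespacesAndNewlines)
            .folding(options: [.caseInsensitive, .diacriticInsensitive, .widthInsensitive],
                     locale: Locale(identifier: "ar"))
    }
    return fold(haystack).contains(fold(needle))
}

// Usage with the variables from the snippet above:
// if containsNormalized(textFieldText, speechRecogText) { ... }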
Firstly, the library is awesome.
But I bumped into an issue when trying to use this with text to speech.
You can reproduce the issue with the following code:
import AVFoundation
func onSpeechToTextEnded() {
let utterance = AVSpeechUtterance(string: "Hello world")
utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
let synthesizer = AVSpeechSynthesizer()
synthesizer.speak(utterance)
}
If I call this function (onSpeechToTextEnded) before actually using this library, I can hear the voice.
But when I call it after using the library, it is not working.
Can you investigate the issue, please?
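A possible cause (my assumption, not verified against the library's source): speech recognition normally puts the shared AVAudioSession into a record-oriented category, and a session left in that state can leave AVSpeechSynthesizer silent. A synthesizer created as a local constant can also be deallocated before it finishes speaking. A sketch of both workarounds; speakAfterRecognition is just an illustrative name:

import AVFoundation

// Keep a strong reference (e.g. a property) so the synthesizer isn't deallocated mid-utterance.
let synthesizer = AVSpeechSynthesizer()

func speakAfterRecognition() {
    // Assumption: restore a playback-capable category after recording has ended.
    let session = AVAudioSession.sharedInstance()
    try? session.setCategory(.playback, mode: .spokenAudio, options: [])
    try? session.setActive(true, options: .notifyOthersOnDeactivation)

    let utterance = AVSpeechUtterance(string: "Hello world")
    utterance.voice = AVSpeechSynthesisVoice(language: "en-GB")
    synthesizer.speak(utterance)
}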
@Cay-Zhang
Adding this to Swift Playgrounds on iPadOS results in an error message: "package doesn't have version tags". I fixed that in my fork by adding a new tag without the "v" prefix.
You can also have a look at this:
erikdoe/ocmock#496
Is it possible to stop the recording from somewhere else in my logic?
swiftSpeechState is a get-only property.
Thanks!
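Not an official answer, but one workaround sketch based on the modifiers that appear elsewhere in this thread: keep a reference to the session handed to onStartRecording and call stopRecording() on it from your own logic.

import SwiftUI
import SwiftSpeech

struct StoppableRecordingView: View {
    @State private var text = ""
    // Hold the session the record button starts so other logic can end it.
    @State private var activeSession: SwiftSpeech.Session?

    var body: some View {
        VStack {
            Text(text)
            SwiftSpeech.RecordButton()
                .swiftSpeechToggleRecordingOnTap(locale: .current)
                .onRecognizeLatest(update: $text)
                .onStartRecording { session in activeSession = session }
                .onStopRecording { _ in activeSession = nil }
            Button("Stop from elsewhere") {
                activeSession?.stopRecording()   // the same call used elsewhere in this thread
                activeSession = nil
            }
        }
        .onAppear { SwiftSpeech.requestSpeechRecognitionAuthorization() }
    }
}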
Maybe like,
SwiftSpeech.RecordButton()
.swiftSpeechRecordOnHold(sessionConfiguration:animation:distanceToCancel:)
.onRecognizeLatest(update: $text)
.onFinal(saveTo: url)
Is there a way to get the recorded audio instead of only recognized text?
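Nothing in this thread shows SwiftSpeech exposing the raw audio, so the sketch below is a general AVAudioEngine approach, independent of the library (and it may conflict with SwiftSpeech's own engine if both record at once): tap the input node and write the buffers to a file.

import AVFoundation

// Illustrative sketch: record microphone audio to a file by tapping the input node.
final class AudioFileRecorder {
    private let engine = AVAudioEngine()
    private var file: AVAudioFile?

    func start(writingTo url: URL) throws {
        let input = engine.inputNode
        let format = input.outputFormat(forBus: 0)
        file = try AVAudioFile(forWriting: url, settings: format.settings)
        input.installTap(onBus: 0, bufferSize: 1024, format: format) { [weak self] buffer, _ in
            try? self?.file?.write(from: buffer)
        }
        try engine.start()
    }

    func stop() {
        engine.inputNode.removeTap(onBus: 0)
        engine.stop()
        file = nil
    }
}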
It worked fine up to iOS 15.6. I tested it on iOS 16 beta 2 on an iPhone 12 Pro Max and I always get "Thread 1: Fatal error: recordingSession is nil in endRecording()" in the following function when I release the speech button.
fileprivate func endRecording() {
guard let session = recordingSession else { preconditionFailure("recordingSession is nil in \(#function)") }
recordingSession?.stopRecording()
delegate.onStopRecording(session: session)
self.viewComponentState = .pending
self.recordingSession = nil
}
Hi!
I've encountered an exception due to a bug in iOS 14 beta 4 + AirPods that breaks (at the software level, hopefully) the mic on AirPods. In system apps (iMessage, Recorder ...) the issue prevents voice recording/recognition from working, but it does not crash the app. In the case of SwiftSpeech, the app crashes with an uncaught exception.
Is it possible to catch such a failure and gracefully pass a notification with the error? Or at least prevent the crash?
The log message is:
Terminating app due to uncaught exception 'com.apple.coreaudio.avfaudio', reason: 'required condition is false: IsFormatSampleRateAndChannelCountValid(format)'
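This exception is thrown from Objective-C, so a Swift do/catch around the library call won't catch it; it typically fires when the input node reports an invalid format (0 Hz or 0 channels). A possible mitigation sketch, not part of SwiftSpeech (names are illustrative): check the input format before starting a recording and surface an error instead of crashing.

import AVFoundation

enum MicError: Error { case invalidInputFormat }

// Hedged sketch: detect the broken-input condition up front instead of crashing.
func ensureMicrophoneIsUsable() throws {
    let format = AVAudioEngine().inputNode.inputFormat(forBus: 0)
    guard format.sampleRate > 0, format.channelCount > 0 else {
        throw MicError.invalidInputFormat   // show an alert instead of starting a session
    }
}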
My app involves both SwiftSpeech's features and sound effects through SwiftySound. All sound effects work fine in the simulator, but on the device all sound effects stop working once the SwiftSpeech button is pressed. I have one button that makes a "click" noise when pressed. It works until I press the SwiftSpeech button.
If I press the SwiftSpeech button first, I get a sound effect the first time, but then not for subsequent presses.
I made a new, simple project without SwiftSpeech just to test the sound, and everything worked fine on the device. I also switched out SwiftySound and used the normal AVAudioPlayer procedure, and the sound works that way, too.
So the only thing I can think of is that there is a conflict between SwiftSpeech and the sound effects. Is it possible that SwiftSpeech is turning off my sound effects? If so, how do I turn them back on? My code appears below:
import SwiftUI
import SwiftSpeech
import SwiftySound
import AVFoundation
import AudioToolbox
// VIEW
struct ContentView: View {
let emojiArray = ["🐵","🦍","🐶","🐺","🦊","🦝","🐱","🦁","🐅","🐴","🦓","🦌","🐮","🐷","🐐","🐪","🦙","🦒","🐘","🦏","🦛","🐁","🐀","🐰","🦇","🐻","🐨","🐼","🦘","🦃","🐔","🐧","🦅","🦆","🦢","🦉","🦚","🐸","🐊","🐢","🦎","🐍","🐳","🐬","🐟","🐙","🐌","🦋","🐜","🐝","🐞","🦗","🕷","🦂","🦟"]
@State private var emoji = ""
@State private var nextEmoji = ""
@State private var text = "What is this? (Press and Hold)"
@State private var theDescription = ""
@State var isCorrect:Bool
@State var player = AVAudioPlayer()
var body: some View {
ZStack(alignment:.top) {
VStack(alignment: .center) {
Text (emoji).font(.system(size: 200, weight: .bold, design: .default))
.onAppear() {
emoji = emojiArray.randomElement() ?? "none"
theDescription = emoji.applyingTransform(.toUnicodeName, reverse: false) ?? "None"
print (theDescription) // get the emoji's unicode name
}
Text (text)
.onAppear {
SwiftSpeech.requestSpeechRecognitionAuthorization()
}
.padding()
SwiftSpeech.RecordButton()
}
.swiftSpeechRecordOnHold()
.onRecognize { _, result in
text = result.bestTranscription.formattedString
print (text)
self.text = text
if theDescription.contains(self.text.uppercased()) == true {
print ("That's right")
text = "That's right!"
isCorrect = true
playRightSound()
}
else {print ("That's wrong")
text = "Try again!"
isCorrect = false
playWrongSound()
}
} handleError: { _, _ in }
Spacer()
Button("Change Animal") {
nextEmoji = emojiArray.randomElement() ?? "none"
while nextEmoji == emoji {
nextEmoji = emojiArray.randomElement() ?? "none"
}
playClickSound()
emoji = nextEmoji
text = "What is this? (Press and hold)"
theDescription = emoji.applyingTransform(.toUnicodeName, reverse: false) ?? "None"
print (theDescription)
}
}
}
func playRightSound(){
print ("Playing right sound")
Sound.play(file:"yay.wav")
}
func playWrongSound() {
Sound.play(file:"raspberry.wav")
}
func playClickSound() {
Sound.play(file:"click.wav")
}
struct ContentView_Previews: PreviewProvider {
static var previews: some View {
ContentView(isCorrect:true)
}
}
}
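One thing worth trying (an assumption about how the audio session is shared, not a confirmed fix): give SwiftSpeech a .playAndRecord audio session configuration, using the same initializer that appears later in this thread, so the session it activates still allows playback of your sound effects.

import SwiftUI
import SwiftSpeech

// Sketch: request a .playAndRecord audio session so sound-effect playback keeps working.
struct SpeakAndPlayView: View {
    @State private var text = ""
    var body: some View {
        VStack {
            Text(text)
            SwiftSpeech.RecordButton()
                .swiftSpeechRecordOnHold(sessionConfiguration: .init(audioSessionConfiguration: .playAndRecord))
                .onRecognizeLatest(update: $text)
        }
        .onAppear { SwiftSpeech.requestSpeechRecognitionAuthorization() }
    }
}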
Is there a way to make recording start automatically when I display SwiftSpeech.Demos.Basic?
When I put in this install URL, I got the error "Unable to find a specification for 'SwiftSpeech'."
Could you please tell me how I can install it through CocoaPods?
Hello ✌️ Thank you for such a wonderful library!
In my app I wanted to implement something similar to the dictation button in Safari search, so that the user doesn't need to tap a button again to stop dictation. There are built-in methods swiftSpeechRecordOnHold and swiftSpeechToggleRecordingOnTap, but both of them need additional interaction from the user, and I also needed a different button.
Here is how I solved it; maybe it will be helpful for somebody in the future. I'll be happy to hear any comments on how this can be done better:
import SwiftUI
import SwiftSpeech
// Creating new extension with custom record button view
public extension SwiftSpeech {
struct RecordButtonCustom: View {
public var body: some View {
RecordButtonView()
}
}
}
// Define new EnvironmentKey for custom state
struct DictationState: EnvironmentKey {
static let defaultValue: SwiftSpeech.State = .pending
}
// Define new Environment Values for custom state
extension EnvironmentValues {
var dictationState: SwiftSpeech.State {
get {
self[DictationState.self]
}
set {
self[DictationState.self] = newValue
}
}
}
struct SwiftSpeechView: View {
@State private var text = "Tap to Speak"
@State private var timer: Timer?
@State var dictationState: SwiftSpeech.State = .pending
var body: some View {
VStack() {
Text(text)
SwiftSpeech
.RecordButtonCustom()
.swiftSpeechToggleRecordingOnTap(locale: Locale(identifier: "en_US"))
.onRecognizeLatest(
includePartialResults: true,
handleResult: { session, result in
text = result.bestTranscription.formattedString
timer?.invalidate()
// initiate timer to stop recording after 2 seconds of silence
timer = Timer.scheduledTimer(withTimeInterval: 2.0, repeats: false) { timer in
session.stopRecording()
dictationState = .pending
}
},
handleError: { session, error in
text = "Error \((error as NSError).code)"
session.stopRecording()
dictationState = .pending
})
.onStartRecording { session in
dictationState = .recording
}
.onStopRecording { session in
dictationState = .pending
}
.onCancelRecording{ session in
dictationState = .cancelling
}
}
.onAppear {
SwiftSpeech.requestSpeechRecognitionAuthorization()
}
.environment(\.dictationState, dictationState)
}
}
#Preview {
SwiftSpeechView()
}
import SwiftUI
import SwiftSpeech
struct RecordButtonView: View {
@Environment(\.dictationState) var state: SwiftSpeech.State
public init() { }
var icon: String {
switch state {
case .pending:
return "mic"
case .recording:
return "mic.fill"
case .cancelling:
return "xmark"
}
}
public var body: some View {
Button("Dictate", systemImage: icon, action: {
print("Dictate")
})
.buttonStyle(.borderless)
.labelStyle(.iconOnly)
.help("Dictate")
}
}
#Preview {
RecordButtonView()
}
My stack:
Xcode 15.1 beta
visionOS 1.0
I know from the examples that SwiftSpeech can handle all supported languages, but I don't see how to implement this functionality. I gather from the example that I must add something like this for Hebrew:
public init(locale: Locale = .autoupdatingCurrent) { self.locale = locale }
public init(localeIdentifier: String) { self.locale = Locale(identifier: "he-IL") }
but I don't understand how to use this setting in SwiftSpeech:
Text(text)
.onAppear {
SwiftSpeech.requestSpeechRecognitionAuthorization()
}
SwiftSpeech.RecordButton()
.swiftSpeechRecordOnHold(sessionConfiguration: .init(audioSessionConfiguration: .playAndRecord))
.onRecognize { _, result in
text = result.bestTranscription.formattedString
self.text = text
if text == word { // word from array, checking pronunciation
playRightSound()
}
else {
playWrongSound()
}
} handleError: { _, _ in }
}
}
}
I appreciate your help!
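A sketch of one way to supply the locale (hedged: the locale: parameter on the session configuration initializer is my assumption; the toggle-on-tap modifier earlier in this thread shows a locale being passed directly, which may be the simpler route):

import SwiftUI
import SwiftSpeech

// Sketch: pass a Hebrew locale through the session configuration (locale: is assumed here),
// keeping the .playAndRecord audio session from the snippet above.
struct HebrewRecognitionView: View {
    @State private var text = ""
    var body: some View {
        VStack {
            Text(text)
            SwiftSpeech.RecordButton()
                .swiftSpeechRecordOnHold(
                    sessionConfiguration: .init(
                        locale: Locale(identifier: "he-IL"),
                        audioSessionConfiguration: .playAndRecord
                    )
                )
                .onRecognizeLatest(update: $text)
        }
        .onAppear { SwiftSpeech.requestSpeechRecognitionAuthorization() }
    }
}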