AI companies usually keep a tight grip on their models to discourage misuse. For example, if you ask a chatbot for someone’s phone number or for instructions on doing something illegal, it will likely just tell you it cannot help. However, as many examples over time have shown, clever prompt engineering or model fine-tuning can sometimes get models to do things they otherwise wouldn’t. The unwanted information may still be hiding somewhere inside the model, where it can be accessed with the right techniques.
Currently, companies tend to deal with this problem by applying guardrails; the idea is to check whether a prompt or the AI’s response contains disallowed content. Machine unlearning, by contrast, asks an AI to forget a piece of information that the company doesn’t want it to know. The technique takes a leaky model and the specific training data to be forgotten and uses them to create a new model. While machine unlearning has ties to older techniques in AI research, it is only in the last couple of years that it has been applied to large language models.
Jinju Kim, a master’s student at Sungkyunkwan University who worked on the paper with Ko and others, sees guardrails as fences around the bad data, put up to keep people away from it. “You can’t get through the fence, but some people will still try to go under the fence or over the fence,” Kim says. Unlearning, Kim says, attempts to remove the bad data altogether, so there is nothing behind the fence at all.
The way current text-to-speech systems are designed complicates things a bit further, though. These so-called “zero-shot” models use examples of people’s speech to learn to re-create any voice, including voices not in the training set: with enough data, a model can produce a good imitation when supplied with only a small sample of someone’s voice. So “unlearning” means a model not only needs to “forget” voices it was trained on, but also has to learn not to imitate specific voices it was never trained on. All the while, it still needs to perform well for other voices.
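To make the zero-shot setup concrete, here is a minimal sketch of the kind of interface such a model exposes. The `ZeroShotTTS` class and its `synthesize` method are hypothetical stand-ins, not Voicebox’s actual API; the point is only that the voice to imitate arrives at inference time as a short reference clip, so it need not appear in the training set at all.

```python
# Minimal sketch of a zero-shot TTS interface (hypothetical names, not Voicebox's real API).
# The voice to imitate is supplied at inference time as a short reference clip, so it does
# not need to have appeared anywhere in the training set.

from dataclasses import dataclass

import numpy as np


@dataclass
class ZeroShotTTS:
    """Stand-in for a pretrained zero-shot speech generation model."""

    sample_rate: int = 16_000

    def synthesize(self, text: str, reference_audio: np.ndarray) -> np.ndarray:
        # A real model would extract speaker characteristics from `reference_audio`
        # and generate `text` spoken in that voice; this placeholder returns silence
        # of a plausible length so the example runs.
        duration_s = max(1.0, 0.4 * len(text.split()))
        return np.zeros(int(duration_s * self.sample_rate), dtype=np.float32)


if __name__ == "__main__":
    model = ZeroShotTTS()
    reference = np.zeros(3 * model.sample_rate, dtype=np.float32)  # ~3-second voice sample
    speech = model.synthesize("Hello, this is a cloned voice.", reference)
    print(f"Generated {len(speech) / model.sample_rate:.1f} s of audio")
```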
To show how to achieve these results, Kim taught a re-creation of Voicebox, Meta’s speech generation model, that when it was prompted to produce a text sample in one of the voices to be forgotten, it should instead respond with a random voice. To make these random voices sound realistic, the model “teaches itself,” using random voices of its own creation.
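The sketch below illustrates that idea in simplified form; it is not the authors’ actual training code, and every name in it (`ToySpeechModel`, `unlearning_step`, the batch fields) is illustrative. A frozen copy of the original model acts as a teacher: for speakers to be forgotten, the student is pushed toward whatever the teacher produces for a random voice prompt, while for everyone else it is pushed to match the teacher’s normal output.

```python
# Simplified sketch of the unlearning idea described above (not the authors' training code).
# A frozen copy of the original model (the "teacher") supplies targets: for speakers to be
# forgotten, the target is the teacher's output for a *random* voice prompt; for everyone
# else, the target is the teacher's normal output.

import torch
import torch.nn.functional as F


class ToySpeechModel(torch.nn.Module):
    """Toy stand-in mapping text + voice-prompt embeddings to 'speech features'."""

    def __init__(self, dim: int = 16):
        super().__init__()
        self.net = torch.nn.Linear(2 * dim, dim)

    def forward(self, text_emb, prompt_emb):
        return self.net(torch.cat([text_emb, prompt_emb], dim=-1))


def unlearning_step(student, teacher, batch, optimizer):
    """One gradient step; `forget_mask` marks speakers whose voices must be forgotten."""
    with torch.no_grad():
        keep_target = teacher(batch["text"], batch["voice_prompt"])     # original behavior
        forget_target = teacher(batch["text"], batch["random_prompt"])  # some random voice

    prediction = student(batch["text"], batch["voice_prompt"])
    forget = batch["forget_mask"].float().unsqueeze(-1)
    target = forget * forget_target + (1.0 - forget) * keep_target

    loss = F.mse_loss(prediction, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    dim, batch_size = 16, 8
    teacher, student = ToySpeechModel(dim), ToySpeechModel(dim)
    student.load_state_dict(teacher.state_dict())        # start from the original model
    for p in teacher.parameters():
        p.requires_grad_(False)                           # teacher stays frozen

    optimizer = torch.optim.Adam(student.parameters(), lr=1e-3)
    batch = {
        "text": torch.randn(batch_size, dim),
        "voice_prompt": torch.randn(batch_size, dim),     # voice the user asks for
        "random_prompt": torch.randn(batch_size, dim),    # random replacement voice
        "forget_mask": torch.randint(0, 2, (batch_size,)),
    }
    print("loss:", unlearning_step(student, teacher, batch, optimizer))
```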
According to the team’s results, which are to be presented at the International Conference on Machine Learning this week, prompting the model to imitate a voice it has “unlearned” now produces output that, according to tools that measure voice similarity, sounds clearly different from the forgotten voice. In practice, the new voice comes across as someone else’s. But the forgetfulness comes at a price: the model is about 2.8 percent worse at imitating permitted voices. While these percentages are hard to interpret on their own, the demo the researchers released online offers many convincing results, both for how well speakers are forgotten and for how well the rest are remembered. A demo sample is given below.
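Voice-similarity tools of this kind typically work on speaker embeddings: fixed-length vectors that summarize a voice, compared with cosine similarity. The sketch below shows that style of evaluation; `speaker_embedding` is a placeholder for any pretrained speaker-verification encoder, not the specific tool the team used.

```python
# Rough sketch of how voice similarity is typically scored: each clip is mapped to a
# fixed-length speaker embedding, and embeddings are compared by cosine similarity.
# `speaker_embedding` is a placeholder, not the specific tool used in the paper.

import numpy as np


def speaker_embedding(waveform: np.ndarray) -> np.ndarray:
    """Placeholder: a real system would run a pretrained speaker-verification model here."""
    seed = int(np.abs(waveform).sum() * 1e6) % (2**32)   # deterministic pseudo-embedding
    return np.random.default_rng(seed).standard_normal(192)  # 192 dims is a common size


def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def voice_match_score(real_clip: np.ndarray, generated_clip: np.ndarray) -> float:
    """Higher means the generated clip sounds more like the real speaker."""
    return cosine_similarity(speaker_embedding(real_clip), speaker_embedding(generated_clip))


if __name__ == "__main__":
    sr = 16_000
    real = np.random.default_rng(0).standard_normal(5 * sr).astype(np.float32)
    generated = np.random.default_rng(1).standard_normal(5 * sr).astype(np.float32)
    print("match score:", round(voice_match_score(real, generated), 3))
```

In such an evaluation, the score between a forgotten speaker’s real audio and the model’s output should drop sharply after unlearning, while scores for permitted speakers should stay close to their pre-unlearning values.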
Ko says the unlearning process can take “several days,” depending on how much the researchers want the model to forget. Their method also requires an audio clip about five minutes long for each speaker whose voice is to be forgotten.
In machine unlearning, pieces of data are often replaced with randomness so that they cannot be reverse-engineered back. In this paper, the randomness measured for the forgotten speakers is very high. That, the authors claim, is a sign that the speakers have truly been forgotten by the model.
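One loose way to picture that check, continuing with the placeholder speaker embeddings from the earlier sketch and not the paper’s actual metric: if prompts using a forgotten speaker’s reference audio now yield effectively random voices, the outputs’ embeddings should scatter widely rather than cluster around the forgotten speaker.

```python
# Illustration of the "randomness" check (placeholder embeddings; not the paper's metric):
# outputs for a forgotten speaker should scatter across voice space rather than cluster
# around that speaker's embedding.

import numpy as np


def embedding_spread(output_embeddings: np.ndarray) -> float:
    """Mean distance of output embeddings from their centroid; higher looks more random."""
    centroid = output_embeddings.mean(axis=0)
    return float(np.linalg.norm(output_embeddings - centroid, axis=1).mean())


def mean_similarity_to_speaker(output_embeddings: np.ndarray, speaker_emb: np.ndarray) -> float:
    """Average cosine similarity of outputs to the forgotten speaker; lower is better."""
    norms = np.linalg.norm(output_embeddings, axis=1) * np.linalg.norm(speaker_emb)
    return float((output_embeddings @ speaker_emb / norms).mean())
```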
“I’ve seen unlearning improve randomness in other contexts,” says Vaidehi Patil, a Ph.D. student at the University of North Carolina at Chapel Hill who studies machine unlearning. “This is one of the first works I’ve seen for speech.” Patil is organizing a machine-unlearning workshop at the conference, where the voice-unlearning research will also be presented.