U.K. agency releases tools to test AI model safety | TechCrunch

The U.K. AI Safety Institute, the U.K.'s recently established AI safety body, has released a toolset designed to “strengthen AI safety” by making it easier for industry, research organizations and academia to develop AI evaluations.

Called Inspect, the toolset, which is available under an open source license (specifically an MIT License), aims to assess certain capabilities of AI models, including the models' core knowledge and ability to reason, and to generate a score based on the results.

In a press release published on Friday, the AI Safety Institute claimed that Inspect marks “the first time that an AI safety testing platform led by a state-backed organization has been released for widespread use.”

A look at Inspect's dashboard.

“Successful collaboration on AI safety testing means having a shared, accessible approach to evaluations, and we hope Inspect can be a building block,” AI Safety Institute chair Ian Hogarth said in a statement. “We hope to see the global AI community using Inspect not only to carry out their own model safety tests, but to help adapt and build upon the open source platform so we can produce high-quality evaluations across the board.”

As we have written before, AI benchmarks are hard, not least because today's most sophisticated AI models are black boxes whose infrastructure, training data and other key details are kept under wraps by the companies that build them. So how does Inspect tackle the challenge? Primarily by being extensible to new testing techniques.

Inspect is made up of three basic components: data sets, solvers and scorers. Data sets provide samples for evaluation tests. Solvers do the work of carrying out the tests. And scorers evaluate the work of solvers and aggregate scores from the tests into metrics.

Inspect's built-in components can be extended by third-party packages written in Python.
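To make the three-part design concrete, here is a minimal sketch in plain Python of how a data set, a solver and a scorer could fit together. It is not Inspect's actual API: the names `Sample`, `toy_solver`, `exact_match` and `evaluate` are hypothetical, and the "solver" is a canned lookup standing in for a real model call.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Sample:
    """One data set entry: a prompt and the answer we expect (hypothetical type)."""
    prompt: str
    target: str


def toy_solver(sample: Sample) -> str:
    """Stand-in for a model call; a real solver would query an AI model."""
    canned_answers = {
        "What is 2 + 2?": "4",
        "What is the capital of France?": "Paris",
        "What color is the sky on a clear day?": "green",  # deliberately wrong
    }
    return canned_answers.get(sample.prompt, "")


def exact_match(output: str, target: str) -> float:
    """Scorer: 1.0 if the output matches the target (ignoring case), else 0.0."""
    return 1.0 if output.strip().lower() == target.strip().lower() else 0.0


def evaluate(
    dataset: list[Sample],
    solver: Callable[[Sample], str],
    scorer: Callable[[str, str], float],
) -> dict:
    """Run the solver on every sample, score each output, aggregate into metrics."""
    scores = [scorer(solver(sample), sample.target) for sample in dataset]
    accuracy = sum(scores) / len(scores) if scores else 0.0
    return {"samples": len(scores), "accuracy": accuracy}


dataset = [
    Sample("What is 2 + 2?", "4"),
    Sample("What is the capital of France?", "Paris"),
    Sample("What color is the sky on a clear day?", "blue"),
]

result = evaluate(dataset, toy_solver, exact_match)
print(result)
```

Because the solver and scorer are passed in as ordinary functions, swapping in a different testing technique means writing a new function rather than changing the evaluation loop, which is the kind of extensibility the article describes.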

In a post on X, Deborah Raji, a research fellow at Mozilla and noted AI ethicist, called Inspect a “testament to the power of public investment in open source tooling for AI accountability.”

Clément Delangue, CEO of AI startup Hugging Face, floated the idea of integrating Inspect with Hugging Face's model library or creating a public leaderboard with the results of the toolset's evaluations.

Inspect's release comes after a U.S. government agency, the National Institute of Standards and Technology (NIST), launched NIST GenAI, a program to assess various generative AI technologies, including text- and image-generating AI. NIST GenAI plans to release benchmarks, help create content authenticity detection systems and encourage the development of software to spot fake or misleading AI-generated information.

In April, the U.S. and U.K. announced a partnership to jointly develop advanced AI model testing, following commitments announced at the U.K.'s AI Safety Summit at Bletchley Park in November of last year. As part of the collaboration, the U.S. intends to launch its own AI safety institute, which will be broadly charged with evaluating risks from AI and generative AI.
