
    Building trust: Foundations of security, safety and transparency in AI

    By Kaluka Wanjala • January 30, 2025 (Updated: August 18, 2025) • 8 Mins Read

    As publicly available artificial intelligence (AI) models rapidly evolve, so do their potential security and safety implications, which calls for a greater understanding of their risks and vulnerabilities. To build a foundation for standardised security, safety and transparency in the development and operation of AI models, as well as their open ecosystems and communities, we must change how we approach current challenges: inconsistent information about models, the lack of distinction between security and safety issues, and the deficient, non-standardised safety evaluations available to and used by model makers.

    1. Risks and vulnerabilities

    While similar, AI security and AI safety are distinct aspects of managing risks in AI systems. AI security protects the systems from external and internal threats, while AI safety provides confidence that the system and data don’t threaten or harm users, society or the environment due to the model’s operation, training or use. However, the relationship between AI security and safety is often blurry.

    An attack that would typically be considered a security concern can lead to safety issues (or vice versa), such as the model producing toxic or harmful content or exposing personal information. The intersection of AI security and safety highlights the critical need for a comprehensive approach to AI risk management that addresses both security and safety concerns in tandem.

    2. Current challenges and trends

    While the AI industry has taken steps to address security and safety issues, several key challenges remain, such as the prioritisation of speed over safety, inadequate governance and deficient reporting practices. Emerging trends suggest that targeting these areas is crucial to developing effective safety, security and transparency practices in AI.

    2.1. Speed over safety

    In the spirit of developing and deploying AI technologies quickly to “secure” increased market share, many organisations are prioritising pace to market over safety testing and ethical considerations. As past security incidents have shown, security often lags years behind a nascent technology, and it typically takes a major incident before an industry begins to self-correct. It's reasonable to predict that, in the absence of people pushing for risk management in AI, we may experience a significant and critical safety and security incident. New models are being introduced with security and safety in mind, and this increase in safety-conscious models is a positive step forward for the AI industry, but the lack of consensus around how to convey the necessary safety and transparency information makes them challenging to evaluate.

    2.2. Governance and self-regulation

    With very little government legislation in effect, the AI industry has relied on voluntary self-regulation and non-binding ethical guidelines, which have proven insufficient for addressing security and safety concerns. Additionally, proposed legislation often doesn't align with the realities of the technology industry or the concerns raised by industry leaders and communities, while corporate AI initiatives, developed primarily for their creators' own use, can fail to address structural issues or provide meaningful accountability.

    Self-governance has had limited success and tends to involve a defined set of best practices implemented independently of primary feature development. As seen historically across industries, prioritising security at the expense of capability is a trade-off stakeholders are often unwilling to make. AI complicates this further by extending the challenge to include direct impacts on safety.

    2.3. Deficient reporting practices

    As the industry currently stands, there are no common methods and practices for handling user-reported model flaws. This is partly because the industry's flawed-yet-functional disclosure and reporting system for software vulnerabilities isn't an apples-to-apples fit for AI. AI is a technical evolution of data science and machine learning (ML), distinct from traditional software engineering and technology development in its focus on data and mathematics rather than on building user-facing systems with established methodologies for threat modelling, user interaction and system security. Without a well-understood disclosure and reporting system for safety hazards, reporting an issue by reaching out directly to the model maker can be cumbersome and unrealistic, and without a standardised reporting process, the impact of an AI safety incident could be far more egregious than it should be, due to delayed coordination and resolution.

    3. Solutions and strategies

    Drawing heavily on prior work by Cattel, Ghosh & Kaffee (2024), we believe that extending model/system cards and tracking hazards are vital to improving security and safety in the AI industry.

    3.1. Extending model/safety cards

    Model cards document the intended use of an AI model, its architecture and, occasionally, the training data used to build it. Today they provide an initial set of human-generated material about a model that adopters use to assess its viability, but model cards could have far greater applicability, travelling with the model regardless of where it ends up or where it is deployed.

    To effectively compare models, adopters and engineers need a consistent set of fields and content on the card, which can be accomplished through specification. In addition to the fields recommended by Barnes, Gebru, Hutchinson, Mitchell, Raji, Spitzer, Vasserman, Wu & Zaldivar (2019), we propose the following changes and additions:

    • Expand intent and use to describe the users (who) and use cases (what) of the model, as well as how the model is to be used.
    • Add scope to exclude known issues that the model producer doesn't intend, or doesn't have the ability, to resolve. This ensures that hazard reporters understand the purpose of the model before reporting a concern that is noted as unaddressable against its defined use.
    • Adjust evaluation data to a nested structure that conveys whether a framework was used and the outputs of the evaluations run on the model. Standardised safety evaluations would enable a skilled user to build a substantially equivalent model.
    • Add governance information so that an adopter or consumer can understand how to engage with the model makers and how the model was produced.
    • Provide optional references, such as artifacts and other content, to help potential consumers understand the model's operation and demonstrate the maturity and professionalism of a given model.

    Requiring these fields for model cards allows the industry to begin establishing content that is essential for reasoning, decision making and reproducing models. By developing an industry standard for model cards, we will be able to promote interoperability of models and their metadata across ecosystems.
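
    To illustrate what such a specification could look like, below is a minimal sketch in Python of a model card structure carrying the proposed extensions. The field names (intent_and_use, scope_exclusions, evaluations, governance, references) are our own illustrative choices, not an established or standardised schema.

        # Minimal sketch of an extended model card; field names are illustrative assumptions.
        from dataclasses import dataclass, field
        from typing import List, Optional

        @dataclass
        class IntentAndUse:
            intended_users: List[str]          # who the model is for
            intended_use_cases: List[str]      # what the model should be used for
            usage_guidance: str                # how the model is meant to be used

        @dataclass
        class EvaluationRun:
            evaluation_name: str               # e.g. a named safety evaluation
            framework: Optional[str]           # framework used to run it, if any
            outputs: dict                      # metrics/results produced by the run

        @dataclass
        class Governance:
            maker_contact: str                 # how adopters can engage the model makers
            production_process: str            # how the model was produced

        @dataclass
        class ExtendedModelCard:
            model_name: str
            architecture: str
            training_data: Optional[str]
            intent_and_use: IntentAndUse
            scope_exclusions: List[str] = field(default_factory=list)   # known, unaddressable issues
            evaluations: List[EvaluationRun] = field(default_factory=list)
            governance: Optional[Governance] = None
            references: List[str] = field(default_factory=list)         # optional supporting artifacts

    A machine-readable structure like this is what would let model cards travel with a model across ecosystems and be compared field by field.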

    3.2. Hazard tracking

    While the common vulnerability disclosure process used to track security flaws is effective in traditional software security, its application in AI systems faces several challenges. For one, ML model issues must satisfy statistical validity thresholds. This means that any issues or problems identified in an AI model, such as biases, must be measured and evaluated against established statistical standards to ensure that they’re meaningful and significant. Secondly, concerns related to trustworthiness and bias often extend beyond the scope of security vulnerabilities and may not align with the accepted definition. Recognising these limitations, we believe that expanding the ecosystem with a centralised, neutral coordinated hazard disclosure and exposure committee and a common flaws and exposure (CFE) number could satisfy these concerns. This is similar to how CVE was launched in 1999 by MITRE to identify and categorise vulnerabilities in software and firmware.
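
    As an example of what a statistical validity threshold might look like in practice, the sketch below screens a reported disparity in harmful outputs between two groups with a two-proportion z-test. The counts, the 0.05 threshold and the choice of test are illustrative assumptions, not an established AI standard.

        # Illustrative check that a reported disparity clears a significance threshold
        # before being treated as a hazard; threshold and test choice are assumptions.
        from math import sqrt, erf

        def two_proportion_z_test(harmful_a: int, total_a: int,
                                  harmful_b: int, total_b: int) -> float:
            """Return the two-sided p-value for a difference in harmful-output rates."""
            p_a, p_b = harmful_a / total_a, harmful_b / total_b
            pooled = (harmful_a + harmful_b) / (total_a + total_b)
            se = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
            z = (p_a - p_b) / se
            # Two-sided p-value from the standard normal distribution.
            return 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))

        # Example: harmful outputs observed for prompts referencing two demographic groups.
        p_value = two_proportion_z_test(harmful_a=42, total_a=500, harmful_b=18, total_b=500)
        if p_value < 0.05:                     # assumed significance threshold
            print(f"Disparity is statistically significant (p={p_value:.4f}); triage as a hazard.")
        else:
            print(f"Disparity is not significant (p={p_value:.4f}); gather more evidence.")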

    Users who discover safety issues are expected to coordinate with the model providers to triage and further analyse the issue. Once the issue is established as a safety hazard, the committee assigns a CFE number. Model makers and distributors can also request CFE numbers to track safety hazards they find in their own models. The coordinated hazard disclosure and exposure committee is the custodian of CFE numbers and is responsible for assigning them to safety hazards, tracking them and publishing them. Additionally, the formation of an adjunct panel will be responsible for facilitating the resolution of contested safety hazards.
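
    By loose analogy with CVE records, a CFE record and the committee's assignment step might look like the sketch below. The identifier format, status values and class names are hypothetical assumptions for illustration only.

        # Hypothetical sketch of a CFE record and the committee-assignment workflow
        # described above; identifier format and status values are assumptions.
        from dataclasses import dataclass
        from datetime import date
        from itertools import count
        from typing import List, Optional

        @dataclass
        class CFERecord:
            cfe_id: str                        # e.g. "CFE-2025-0001" (illustrative format)
            model_name: str
            reporter: str                      # user, model maker or distributor
            description: str
            status: str = "triage"             # triage -> confirmed -> published / contested
            published: Optional[date] = None

        class HazardCommittee:
            """Custodian that assigns, tracks and publishes CFE numbers."""
            def __init__(self) -> None:
                self._counter = count(1)
                self.records: List[CFERecord] = []

            def assign(self, model_name: str, reporter: str, description: str) -> CFERecord:
                record = CFERecord(
                    cfe_id=f"CFE-{date.today().year}-{next(self._counter):04d}",
                    model_name=model_name,
                    reporter=reporter,
                    description=description,
                )
                self.records.append(record)
                return record

            def publish(self, record: CFERecord) -> None:
                record.status = "published"
                record.published = date.today()

        # Example: a user-reported hazard confirmed after triage with the model maker.
        committee = HazardCommittee()
        cfe = committee.assign("example-model-7b", "external reporter",
                               "Model leaks personal information under specific prompts.")
        committee.publish(cfe)
        print(cfe.cfe_id, cfe.status)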

    4. What next?

    Models developed according to open source principles have the potential to play a significant role in the future of AI. The frameworks and tools needed to develop and manage models against industry and consumer expectations require openness and consistency for organisations to reasonably assess risk. The more transparency and access to critical functionality we have, the greater our ability to discover, track and resolve safety and security hazards before they have widespread impact. Our proposals aim to afford flexibility and consistency through existing governance, workflows and structure, and, when implemented, could provide more efficient avenues for addressing the pressing need to manage AI safety effectively.
