AI Archives - SD Times
https://sdtimes.com/tag/ai/ | Software Development News

IBM releases open AI agents for resolving GitHub issues
https://sdtimes.com/softwaredev/ibm-releases-open-ai-agents-for-resolving-github-issues/
Fri, 01 Nov 2024
IBM is releasing a family of AI agents (IBM SWE-Agent 1.0) that are powered by open LLMs and can resolve GitHub issues automatically, freeing up developers to work on other things rather than getting bogged down by their backlog of bugs that need fixing. 

“For most software developers, every day starts with where the last one left off. Trawling through the backlog of issues on GitHub you didn’t deal with the day before, you’re triaging which ones you can fix quickly, which will take more time, and which ones you really don’t know what to do with yet. You might have 30 issues in your backlog and know you only have time to tackle 10,” IBM wrote in a blog post. This new family of agents aims to alleviate this burden and shorten the time developers are spending on these tasks. 

One of the agents is a localization agent that can find the file and line of code causing an error. According to IBM, pinpointing the line of code behind a bug report can be time-consuming for developers; now they can tag the bug report they're working on in GitHub with "ibm-swe-agent-1.0" and the agent will work to find the offending code. 

Once found, the agent suggests a fix that the developer could implement. At that point the developer can either fix the issue themselves or enlist other SWE agents for further assistance. 

Other agents in the SWE family include one that edits lines of code based on developer requests and one that can be used to develop and execute tests. All of the SWE agents can be invoked directly from within GitHub.
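IBM hasn't published a dedicated client for this flow; since the agents are triggered by an ordinary GitHub label, invoking one can be sketched with GitHub's standard REST endpoint for adding labels to an issue. The owner, repo, and issue number below are placeholders:

```python
import os
import json
from urllib.request import Request

API_ROOT = "https://api.github.com"

def build_label_request(owner: str, repo: str, issue: int, label: str) -> Request:
    """Build the GitHub REST call that adds a label to an issue.

    Per IBM's description, adding the "ibm-swe-agent-1.0" label is what
    triggers the localization agent on that issue.
    """
    url = f"{API_ROOT}/repos/{owner}/{repo}/issues/{issue}/labels"
    body = json.dumps({"labels": [label]}).encode()
    headers = {
        "Authorization": f"Bearer {os.environ.get('GITHUB_TOKEN', '')}",
        "Accept": "application/vnd.github+json",
    }
    return Request(url, data=body, headers=headers, method="POST")

req = build_label_request("my-org", "my-repo", 42, "ibm-swe-agent-1.0")
# urllib.request.urlopen(req) would send it; omitted here to stay offline.
```

The same endpoint works from any CI system or bot account, which is presumably how teams would wire the agents into an existing triage workflow.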

According to IBM’s early testing, these agents can localize and fix problems in less than five minutes and have a 23.7% success rate on SWE-bench tests, a benchmark that tests an AI system’s ability to solve GitHub issues. 

IBM explained that it set out to create SWE agents as an alternative to other competitors who use large frontier models, which tend to cost more. “Our goal was to build IBM SWE-Agent for enterprises who want a cost efficient SWE agent to run wherever their code resides — even behind your firewall — while still being performant,” said Ruchir Puri, chief scientist at IBM Research.

Opsera and Databricks partner to automate data orchestration
https://sdtimes.com/data/opsera-and-databricks-partner-to-automate-data-orchestration/
Wed, 30 Oct 2024
Opsera, the Unified DevOps platform powered by Hummingbird AI trusted by top Fortune 500 companies, today announced that it has partnered with Databricks, the Data and AI company, to empower software and DevOps engineers to deliver software faster, safer and smarter through AI/ML model deployments and schema rollback capabilities.

Opsera leverages its DevOps platform and integrations and builds AI agents and frameworks to revolutionize the software delivery management process with a unique approach to automating data orchestration.
Opsera is now part of Databricks’ Built on Partner Program and Technology Partner Program.

The partnership enables:
● AI/ML Model Deployments with Security and Compliance Guardrails: Opsera ensures that model training and deployment using Databricks infrastructure meets security and quality guardrails and thresholds before deployment. Proper model training allows customers to optimize Databricks Mosaic AI usage and reduce deployment risks.

● Schema Deployments with Rollback Capabilities: Opsera facilitates controlled schema deployments in Databricks with built-in rollback features for enhanced flexibility and confidence. Customers gain better change management and compliance tracking and reduce unfettered production deployments, leading to increased adoption of Databricks and enhanced value of automation pipelines.

“The development of advanced LLM models and Enterprise AI solutions continues to fuel an insatiable demand for data,” said Torsten Volk, Principal Analyst at Enterprise Strategy Group. “Partnerships between data management and data orchestration vendors to simplify the ingestion and ongoing management of these vast flows of data are necessary responses to these complex and extremely valuable AI efforts.”

Additional benefits of the Opsera and Databricks partnership include:

● Powerful ETL (Extract, Transform, Load) Capabilities: Databricks’ Spark-based engine enables efficient ETL from various sources into a centralized data lake. This empowers Opsera to collect and orchestrate vast amounts of data, increasing developer efficiency and accelerating data processing.

● Scalable and Flexible Data Intelligence Platform: Databricks’ Delta UniForm and Unity Catalog provide a scalable, governed, interoperable, and reliable Data Lakehouse solution, enabling Opsera to orchestrate large volumes of structured and unstructured data efficiently.

● Advanced Analytics and ML: Databricks Mosaic AI’s integrated machine learning capabilities allow Opsera to efficiently build and deploy AI/ML models for predictive analytics, anomaly detection, and other advanced use cases.

● Seamless Integration: Databricks integrates seamlessly with Opsera’s existing technology stack, facilitating smooth data flow and enabling end-to-end visibility of the DevOps platform.

Tabnine’s new Code Review Agent validates code based on a dev team’s unique best practices and standards
https://sdtimes.com/ai/tabnines-new-code-review-agent-validates-code-based-on-a-dev-teams-unique-best-practices-and-standards/
Wed, 30 Oct 2024
The AI coding assistant provider Tabnine is releasing a private preview for its Code Review Agent, a new AI-based tool that validates software based on the development team’s unique best practices and standards for software development. 

According to Tabnine, using AI to review code is nothing new, but many of the tools currently available check code against general standards, whereas software development teams often develop their own unique ways of creating software. “What one team sees as their irrefutable standard, another team might reject outright. For AI to add meaningful value in improving software quality for most teams, it must have the same level of understanding as a fully onboarded, senior member of the team,” Tabnine explained in a blog post.

Code Review Agent allows teams to create rules based on their own standards, best practices, and company policies. These rules are then applied during code review at the pull request or in the IDE.

Development teams can provide the parameters their code should comply with in natural language, and Tabnine works behind the scenes to convert that into a set of rules. Tabnine also offers a set of predefined rules that can be incorporated into the ruleset as well. 

For example, one of Tabnine’s predefined rules is “Only use SHA256 to securely hash data” and a customer-specific rule is “Only use library acme_secure_api_access for accessing external APIs, do not use standard http libraries.”
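As a rough illustration of what the SHA-256 rule means in practice (the function name here is our own, not Tabnine's), compliant code reaches for SHA-256 rather than a weaker digest:

```python
import hashlib

def hash_record(data: bytes) -> str:
    # Compliant with a rule like "Only use SHA256 to securely hash data".
    return hashlib.sha256(data).hexdigest()

# A review agent enforcing that rule would flag code like this instead:
#   hashlib.md5(data).hexdigest()   # weak digest, violates the rule

print(hash_record(b"example"))  # 64-character SHA-256 hex digest
```

The customer-specific library rule would work the same way: the agent scans the diff for `import http`-style usage and suggests the sanctioned wrapper instead.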

When a developer creates a pull request that doesn’t meet the established rules, Code Review Agent flags the issue for the code reviewer and also offers suggestions on how to fix the problem. 

“By comprehensively reading through code and ensuring that it matches each team’s unique expectations, Tabnine saves engineering teams significant time and effort while applying a level of rigor in code review that was never possible with static code analysis. Just like AI code generation automates away simpler coding tasks so developers can focus on more valuable tasks, Tabnine’s AI Code Review agent automates common review tasks, freeing up code reviewers to focus on higher-order analysis instead of adherence to best practices,” Tabnine wrote. 

This tool is currently available as a private preview to Tabnine Enterprise customers, and Tabnine has published an example video of Code Review Agent in action.

GitHub Copilot now offers access to new Anthropic, Google, and OpenAI models
https://sdtimes.com/ai/github-copilot-now-offers-access-to-anthropic-google-and-openai-models/
Tue, 29 Oct 2024
GitHub is hosting its annual user conference, GitHub Universe, today and tomorrow, and has announced a number of new AI capabilities that will enable developers to build applications more quickly, securely, and efficiently. 

Many of the updates were across GitHub Copilot. First up, GitHub announced that users now have access to more model choices thanks to partnerships with Anthropic, Google, and OpenAI. Newly added model options include Anthropic’s Claude 3.5 Sonnet, Google’s Gemini 1.5 Pro, and OpenAI’s GPT-4o, o1-preview, and o1-mini. 

By offering developers more choices, GitHub is enabling them to choose the model that works best for their specific use case, the company explained.

“In 2024, we experienced a boom in high-quality large and small language models that each individually excel at different programming tasks. There is no one model to rule every scenario, and developers expect the agency to build with the models that work best for them,” said Thomas Dohmke, CEO of GitHub. “It is clear the next phase of AI code generation will not only be defined by multi-model functionality, but by multi-model choice. Today, we deliver just that.”

Copilot Workspace has a number of new features as well, like a build and repair agent, brainstorming mode, integrations with VS Code, and iterative feedback loops. 

GitHub Models, which enables developers to experiment with different AI models, has a number of features now in public preview, including side-by-side model comparison, support for multi-modal models, the ability to save and share prompts and parameters, and additional cookbooks and SDK support in GitHub Codespaces.

Copilot Autofix, which analyzes and provides suggestions about code vulnerabilities, added security campaigns, enabling developers to triage up to 1,000 alerts at once and filter them by type, severity, repository, and team. The company also added integrations with ESLint, JFrog SAST, and Black Duck Polaris. Both security campaigns and these partner integrations are available in public preview. 

Other new features in GitHub Copilot include code completion in Copilot for Xcode (in public preview), a code review capability, and the ability to customize Copilot Chat responses based on a developer’s preferred tools, organizational knowledge, and coding conventions.

In terms of what’s coming next, starting November 1, developers will be able to edit multiple files at once using Copilot Chat in VS Code. Then, in early 2025, Copilot Extensions will be generally available, enabling developers to integrate their other developer tools into GitHub Copilot, like Atlassian Rovo, Docker, Sentry, and Stack Overflow.

The company also announced a technical preview for GitHub Spark, an AI tool for building fully functional micro apps (called “sparks”) solely using text prompts. Each spark can integrate external data sources without requiring the creator to manage cloud resources. 

While developers can make changes to sparks by diving into the code, any user can iterate and make changes entirely using natural language, reducing the barrier to application development. 

Finished sparks can be run immediately on the user’s desktop, tablet, or mobile device, or shared with others, who can use them or even build upon them. 

“With Spark, we will enable over one billion personal computer and mobile phone users to build and share their own micro apps directly on GitHub—the creator network for the Age of AI,” said Dohmke.

And finally, the company revealed the results of its Octoverse report, which provides insights into the world of open source development by studying public activity on GitHub. 

Some key findings were that Python is now the most used language on the platform, AI usage is up 98% since last year, and the number of global developers continues increasing, particularly across Africa, Latin America, and Asia. 

Tech companies are turning to nuclear energy to meet growing power demands caused by AI
https://sdtimes.com/ai/tech-companies-are-turning-to-nuclear-energy-to-meet-growing-power-demands-caused-by-ai/
Fri, 25 Oct 2024
The explosion in interest in AI, particularly generative AI, has had many positive benefits: increased productivity, easier and faster access to information, and often a better user experience in applications that have embedded AI chatbots. 

But for all its positives, there is one huge problem that still needs solving: how do we power it all? 

As of August of this year, ChatGPT had more than 200 million weekly active users, according to a report by Axios.  And it’s not just OpenAI; Google, Amazon, Apple, IBM, Meta, and many other players in tech have created their own AI models to better serve their customers and are investing heavily in AI strategies.

While people may generally be able to access these services for free, they’re not free in terms of the power they require. Research from Goldman Sachs indicates that a single ChatGPT query uses almost 10 times as much power as a Google search. 

Its research also revealed that data center power demand will grow 160% by 2030. Data centers will go from using 1-2% of total power today to 3-4% by then, and by 2028, AI will represent 19% of total data center power demand.

Overall, the U.S. will see a 2.4% increase in energy demands every year through 2030, and will need to invest approximately $50 billion just to support its data centers. 
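A quick back-of-the-envelope check shows the quoted figures hang together; the midpoint and six-year horizon chosen below are our own assumptions, not Goldman Sachs':

```python
# Sanity-checking the projections quoted above.
share_now = 1.5            # data centers' current share of total power, % (midpoint of 1-2%)
dc_growth = 1.60           # data center demand grows 160% by 2030
total_growth = 1.024 ** 6  # total US demand grows 2.4%/yr over ~6 years

# Data center demand grows 2.6x while the denominator grows ~15%.
share_2030 = share_now * (1 + dc_growth) / total_growth
print(round(share_2030, 1))  # lands inside the cited 3-4% range
```

In other words, the 160% demand growth and the 3-4% share projection are two views of the same estimate once overall grid growth is factored in.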

“Energy consumption in the United States has been pretty flat, really over the course of the last two decades,” Jason Carolan, chief innovation officer at Flexential, explained in a recent episode of ITOps Times’ podcast, Get With IT. “Part of that was that perhaps COVID sort of slowed things down. But now we’re at this point, whether it’s AI or whether it’s just electrification in general, that we’re really running out of capacity. In fact, there are states where projects of large scale, electrification builds, as well as data center builds, basically have stopped because there isn’t power capacity available.” 

To meet these growing demands, tech companies are turning to nuclear energy, and in the past month or so, Google, Microsoft, and Amazon have all announced investments in nuclear energy plants. 

On September 20, Microsoft announced that it had signed a 20-year deal with Constellation Energy to restart Three Mile Island Unit 1. This is not the reactor (Unit 2) behind the infamous Three Mile Island disaster in 1979; Unit 1 was restarted after the accident in 1985 and ran until 2019, when it shut down due to cost. 

Constellation and Microsoft say that the reactor should be back in operation by 2028 after improvements are made to the turbine, generator, main power transformer, and cooling and control systems. Constellation claims the reactor will generate around 835 megawatts of energy. 

“Powering industries critical to our nation’s global economic and technological competitiveness, including data centers, requires an abundance of energy that is carbon-free and reliable every hour of every day, and nuclear plants are the only energy sources that can consistently deliver on that promise,” said Joe Dominguez, president and CEO of Constellation.

Google and Amazon followed suit in October, both with news that they are investing in small modular reactors (SMRs). SMRs generate less power than traditional reactors, typically around 100 to 300 megawatts compared to 1,000 megawatts from a large-scale reactor, according to Carolan. Even though they generate less power, they also include more safety features, have a smaller footprint so they can be installed in places where a large reactor couldn’t, and cost less to build, according to the Office of Nuclear Energy.

“There’s been a lot of money and innovation put into small scale nuclear reactors over the course of the last four or five years, and there are several projects underway,” said Carolan. “There continues to be almost open-source-level innovation in the space because people are starting to share data points and share operational models.”

Google announced it had signed a deal with Kairos Power to purchase nuclear energy generated by its small modular reactors, revealing that Kairos' first SMR should be online by 2030, with more SMRs deployed through 2035. Amazon also announced it is partnering with energy companies in Washington and Virginia to develop SMRs there, and has invested in X-energy, a company developing SMR reactors and fuel.

“The grid needs new electricity sources to support AI technologies that are powering major scientific advances, improving services for businesses and customers, and driving national competitiveness and economic growth. This agreement helps accelerate a new technology to meet energy needs cleanly and reliably, and unlock the full potential of AI for everyone,” Michael Terrell, senior director of energy and climate at Google, wrote in the announcement. 

Carolan did note that SMRs are still a relatively new technology, and many of the designs have not yet been approved by the Nuclear Regulatory Commission. 

“I think we’re going to be in a little bit of a power gap here, in the course of the next two to three years as we continue to scale up nuclear,” he explained. As of April 2024, the U.S. had only 54 operating nuclear power plants, and in 2023, just 18.6% of its total power generation came from nuclear power. 

Google expands Responsible Generative AI Toolkit with support for SynthID, a new Model Alignment library, and more
https://sdtimes.com/ai/google-expands-responsible-generative-ai-toolkit-with-support-for-synthid-a-new-model-alignment-library-and-more/
Thu, 24 Oct 2024
Google is making it easier for companies to build generative AI responsibly by adding new tools and libraries to its Responsible Generative AI Toolkit.

The Toolkit provides tools for responsible application design, safety alignment, model evaluation, and safeguards, all of which work together to improve the ability to responsibly and safely develop generative AI. 

Google is adding the ability to watermark and detect text that is generated by an AI product using Google DeepMind’s SynthID technology. The watermarks aren’t visible to humans viewing the content, but can be seen by detection models to determine if content was generated by a particular AI tool. 

“Being able to identify AI-generated content is critical to promoting trust in information. While not a silver bullet for addressing problems such as misinformation or misattribution, SynthID is a suite of promising technical solutions to this pressing AI safety issue,” SynthID’s website states. 

The next addition to the Toolkit is the Model Alignment library, which allows the LLM to refine a user’s prompts based on specific criteria and feedback.  

“Provide feedback about how you want your model’s outputs to change as a holistic critique or a set of guidelines. Use Gemini or your preferred LLM to transform your feedback into a prompt that aligns your model’s behavior with your application’s needs and content policies,” Ryan Mullins, research engineer and RAI Toolkit tech lead at Google, wrote in a blog post.
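The Model Alignment library wraps this pattern behind its own API; as a rough sketch of the underlying idea, the feedback can be folded into a meta-prompt that Gemini (or any LLM) is asked to answer with a rewritten prompt. The function and wording below are illustrative, not the library's:

```python
def build_refinement_request(original_prompt: str, feedback: str) -> str:
    """Fold holistic feedback into a meta-prompt that asks an LLM to
    rewrite the original prompt accordingly."""
    return (
        "You are a prompt engineer. Rewrite the prompt below so that the "
        "model's outputs satisfy the feedback.\n\n"
        f"Prompt:\n{original_prompt}\n\n"
        f"Feedback:\n{feedback}\n\n"
        "Return only the improved prompt."
    )

request = build_refinement_request(
    "Summarize this support ticket.",
    "Summaries must omit personal data and stay under 50 words.",
)
# Sending `request` to Gemini would return a prompt that bakes the
# content-policy constraints directly into the instructions.
```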

And finally, the last update is an improved developer experience in the Learning Interpretability Tool (LIT) on Google Cloud, which is a tool that provides insights into “how user, model, and system content influence generation behavior.”

It now includes a model server container, allowing developers to deploy Hugging Face or Keras LLMs on Google Cloud Run GPUs with support for generation, tokenization, and salience scoring. Users can also now connect to self-hosted models or Gemini models using the Vertex API. 

“Building AI responsibly is crucial. That’s why we created the Responsible GenAI Toolkit, providing resources to design, build, and evaluate open AI models. And we’re not stopping there! We’re now expanding the toolkit with new features designed to work with any LLMs, whether it’s Gemma, Gemini, or any other model. This set of tools and features empower everyone to build AI responsibly, regardless of the model they choose,” Mullins wrote. 

Opsera extends AI Code Assistant Insights for developer productivity
https://sdtimes.com/ai/opsera-extends-ai-code-assistant-insights-for-developer-productivity/
Wed, 23 Oct 2024
DevOps platform provider Opsera today announced AI Code Assistant Insights, empowering enterprises to improve developer productivity, impact, and time savings, and to accelerate the ROI of their investment in AI Code Assistants.

“IDC research finds that on average, developers estimate a 35% increase in their productivity with the use of an AI coding assistant. However, it is challenging to have visibility into adoption and measure these gains across the organization,” said Katie Norton, Research Manager, DevSecOps at IDC. “The metrics available in Opsera’s Unified Insights should enable organizations to demonstrate the ROI of GitHub Copilot adoption, enhancing their ability to track and quantify productivity improvements.”

For enterprises looking to proactively measure the ROI of their AI Code Assistant investments and improve productivity across all software delivery tools, teams, and environments, the new AI Code Assistant Insights in the Opsera Unified DevOps Platform provides actionable insights on developer-level productivity, pinpoints areas to improve adoption, and includes reporting on the quality and success of AI suggestions.

Users can:

● Unify metrics across the “Code to Cloud” journey, incorporating DevEx KPIs (time to PR, lead time, cycle time, and performance), source code metrics (commits, PRs, throughput, quality, and security), and DORA metrics (deployment frequency, change failure rate, lead time, and MTTR). This comprehensive approach allows teams to measure impact, acceptance rate, and velocity effectively.

● Gain actionable insights into team performance, including throughput, quality, velocity, security, and stability, as well as developer-level metrics, to pinpoint areas for improvement and optimize processes.

● Seamlessly integrate with leading AI code assistants like GitHub Copilot and Amazon Q, offering a unique, holistic “single pane of glass” view of the entire development lifecycle.
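The DORA metrics named above are simple aggregates over a deployment log. A minimal sketch, with invented sample data, of how three of them fall out:

```python
from datetime import datetime

# Toy deployment log: (timestamp, succeeded, minutes-to-restore if failed)
deploys = [
    (datetime(2024, 10, 1), True, 0),
    (datetime(2024, 10, 8), False, 90),
    (datetime(2024, 10, 15), True, 0),
    (datetime(2024, 10, 22), False, 30),
]

days = (deploys[-1][0] - deploys[0][0]).days or 1
deployment_frequency = len(deploys) / days            # deploys per day
failures = [d for d in deploys if not d[1]]
change_failure_rate = len(failures) / len(deploys)    # fraction of failed deploys
mttr = sum(d[2] for d in failures) / len(failures)    # mean time to restore, minutes

print(deployment_frequency, change_failure_rate, mttr)
```

A platform like Opsera's differs mainly in where the log comes from: the same arithmetic is applied to events collected automatically from CI/CD and source control integrations.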

“AI Code Assistants are critical for developer productivity and efficiency, and we are proud to enable engineering teams to adopt them and realize their benefits faster than ever before and provide metrics on the positive impact,” said Kumar Chivukula, co-founder and CEO of Opsera. “With Opsera’s Unified DevOps Platform, we provide persona and team-level insights, pinpoint bottlenecks and inefficiencies using Opsera Hummingbird AI, and measure security and quality across tools to help enterprises improve overall developer productivity and experience.”

Unlike other platforms, Opsera integrates with the entire software development lifecycle, with over 100 native integrations and unified data for SDLC, IaC, SaaS applications like Salesforce, Databricks, and Snowflake, and mobile application development. This helps teams maximize their investment and provides the most comprehensive view of the development lifecycle.

Anthropic releases updated version of Claude 3.5 Sonnet and first release of Claude 3.5 Haiku
https://sdtimes.com/ai/anthropic-releases-updated-version-of-claude-3-5-sonnet-and-first-release-of-claude-3-5-haiku/
Wed, 23 Oct 2024
Anthropic has a number of updates to share about its AI models, including an updated version of Claude 3.5 Sonnet, the release of Claude 3.5 Haiku, and a public beta for a capability that enables users to instruct Claude to use computers as a human would. 

The new version of Claude 3.5 Sonnet features improvements across the board compared to the original version. It outperforms the original in graduate-level reasoning, undergraduate-level knowledge, coding, math problem solving, high school math competition, visual question answering, agentic coding, and agentic tool use.

“Early customer feedback suggests the upgraded Claude 3.5 Sonnet represents a significant leap for AI-powered coding,” Anthropic wrote in a post. The company also revealed that GitLab tested the model for DevSecOps tasks and found up to a 10% improvement in reasoning across different use cases. 

Claude 3.5 Haiku is the company’s fastest model, and has a similar cost and speed compared to Claude 3 Haiku, but improves across every skill set, even outperforming the previous generation’s largest model, Claude 3 Opus, in many benchmarks.

According to Anthropic, Claude 3.5 Haiku does especially well in coding tasks, scoring 40.6 on SWE-bench, which is a benchmark that evaluates how well a model can reason through GitHub issues. This is better than the original Claude 3.5 Sonnet and GPT-4o, the company claims. 

“With low latency, improved instruction following, and more accurate tool use, Claude 3.5 Haiku is well suited for user-facing products, specialized sub-agent tasks, and generating personalized experiences from huge volumes of data—like purchase history, pricing, or inventory records,” Anthropic wrote.

Claude 3.5 Haiku will be available in a few weeks through Anthropic’s API, Amazon Bedrock, and Google Cloud’s Vertex AI. It will first be available as a text-only model, and image input will be added down the line. 

Beyond its model announcements, Anthropic also announced the public beta for a new capability that enables Claude to perform general computer tasks. It built an API that allows the model to perceive and interact with computer interfaces, enabling it to complete tasks like moving the cursor to open an application, navigating to specific web pages, or filling out a form with data from those pages.
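In API terms, enabling the capability amounts to declaring a computer tool on a messages request. The tool type and beta flag below match Anthropic's announcement at the time of writing, but should be checked against current documentation before use:

```python
# Sketch of the request shape for Anthropic's computer-use beta.
def build_computer_use_request(task: str) -> dict:
    return {
        "model": "claude-3-5-sonnet-20241022",
        "max_tokens": 1024,
        "tools": [{
            "type": "computer_20241022",   # tool version from the beta announcement
            "name": "computer",
            "display_width_px": 1024,
            "display_height_px": 768,
        }],
        "betas": ["computer-use-2024-10-22"],
        "messages": [{"role": "user", "content": task}],
    }

req = build_computer_use_request("Open the browser and fill out the signup form.")
# With the official SDK this maps to client.beta.messages.create(**req);
# the response contains tool_use blocks (clicks, keystrokes, screenshot
# requests) that the caller's own automation loop must execute.
```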

In early testing via the OSWorld benchmark, which evaluates an AI’s ability to use computers like humans, Claude 3.5 Sonnet scored 14.9% in the screenshot-only category, which is the highest score of any model (the next highest score is 7.8%). Additionally, when given more steps to complete a task, Claude scored 22%.

Anthropic noted that some of the areas that Claude struggles with include scrolling, dragging, and zooming, and therefore recommends people experiment with it on low-risk tasks.

“Learning from the initial deployments of this technology, which is still in its earliest stages, will help us better understand both the potential and the implications of increasingly capable AI systems,” Anthropic wrote. 

The post Anthropic releases updated version of Claude 3.5 Sonnet and first release of Claude 3.5 Haiku appeared first on SD Times.

IBM releases next generation of Granite LLMs https://sdtimes.com/ai/ibm-releases-next-generation-of-granite-llms/ Mon, 21 Oct 2024 17:53:08 +0000

IBM has announced the third generation of its open-source Granite LLM family, which includes a range of models suited to different use cases. 

“Reflecting our focus on the balance between powerful and practical, the new IBM Granite 3.0 models deliver state-of-the-art performance relative to model size while maximizing safety, speed and cost-efficiency for enterprise use cases,” IBM wrote in a blog post.

The Granite 3.0 family includes general-purpose models, guardrail- and safety-focused models, and mixture-of-experts models. 

The main model in this family is Granite 3.0 8B Instruct, an instruction-tuned, dense decoder-only model that offers strong performance in RAG, classification, summarization, entity extraction, and tool use. It matches open models of similar sizes on academic benchmarks and exceeds them for enterprise tasks and safety, according to IBM.

“Trained using a novel two-phase method on over 12 trillion tokens of carefully vetted data across 12 different natural languages and 116 different programming languages, the developer-friendly Granite 3.0 8B Instruct is a workhorse enterprise model intended to serve as a primary building block for sophisticated workflows and tool-based use cases,” IBM wrote.
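
For the RAG use cases IBM calls out, the basic pattern is to pack retrieved passages into the prompt alongside the question. The sketch below shows a generic version of that assembly step; the prompt layout is illustrative, not Granite’s actual chat template, which in practice you would apply via the tokenizer’s chat template on Hugging Face (model id `ibm-granite/granite-3.0-8b-instruct`):

```python
# Generic RAG prompt assembly: number the retrieved passages so the model
# can cite them, then append the question.

def build_rag_prompt(question: str, passages: list[str]) -> str:
    context = "\n\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return (
        "Answer using only the passages below. Cite passage numbers.\n\n"
        f"{context}\n\nQuestion: {question}\nAnswer:"
    )

prompt = build_rag_prompt(
    "When was Granite 3.0 released?",
    ["IBM announced Granite 3.0 in October 2024.",
     "Granite models are Apache 2.0 licensed."],
)
print(prompt)
```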

This release also includes new Granite Guardian models that safeguard against social bias, hate, toxicity, profanity, violence, and jailbreaking, as well as perform RAG-specific checks like groundedness, context relevance, and answer relevance.  
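
A groundedness check asks whether an answer is actually supported by the retrieved context. The Guardian models are LLM-based classifiers; the toy word-overlap heuristic below is only meant to illustrate the interface such a check exposes, not how Guardian works internally:

```python
# Toy groundedness check: what fraction of the answer's words also appear
# in the retrieved context? Real guardrail models use an LLM classifier
# instead of word overlap.

def groundedness(answer: str, context: str) -> float:
    """Fraction of answer words that also appear in the context."""
    answer_words = {w.lower().strip(".,") for w in answer.split()}
    context_words = {w.lower().strip(".,") for w in context.split()}
    if not answer_words:
        return 0.0
    return len(answer_words & context_words) / len(answer_words)

ctx = "Granite 3.0 was released under the Apache 2.0 license."
print(groundedness("Granite 3.0 uses the Apache 2.0 license", ctx))  # high
print(groundedness("The model costs ten dollars per month", ctx))    # low
```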

There are also a number of other models in the Granite 3.0 family, including: 

  • Granite-3.0-8B-Base, Granite-3.0-2B-Instruct, and Granite-3.0-2B-Base, which are general-purpose LLMs
  • Granite-3.0-3B-A800M-Instruct and Granite-3.0-1B-A400M-Instruct, which are mixture-of-experts models that minimize latency and cost
  • Granite-3.0-8B-Instruct-Accelerator, a speculative decoder that improves speed and efficiency

All of the models are available under the Apache 2.0 license on Hugging Face, and Granite 3.0 8B and 2B and Granite Guardian 3.0 8B and 2B are available for commercial use on watsonx. 

The company also revealed that by the end of 2024, it plans to expand all model context windows to 128K tokens, further improve multilingual support, and introduce multimodal image-in, text-out capabilities. 

And in addition to releasing these new Granite models, the company also revealed the upcoming availability of the newest version of the watsonx Code Assistant, as well as plans to release new tools for developers building, customizing, and deploying AI through watsonx.ai.

Microsoft 365 Copilot allows users to create their own autonomous agents https://sdtimes.com/msft/microsoft-365-copilot-allows-users-to-create-their-own-autonomous-agents/ Mon, 21 Oct 2024 16:08:17 +0000

Microsoft is continuing to improve generative AI across Windows with new updates to Microsoft 365 Copilot.

The company has announced that the ability for users to create their own autonomous agents in Copilot Studio is moving from private to public preview next month.

Agents can be triggered by specific events and act on their own, rather than being activated by a conversation. For instance, when an email arrives, an agent can be activated to look up the sender’s details and account, see previous communications, check inventory, ask the sender their preferences, and then take the necessary actions to close a ticket. 
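
The event-triggered pattern described above can be sketched as an agent bound to an event type, running a pipeline of steps when the event fires with no conversation involved. All names and steps below are illustrative, not Copilot Studio’s actual API:

```python
# Minimal event-triggered agent: a handler is registered against an event
# type and runs a fixed pipeline when the event fires.

from typing import Callable

class EventBus:
    def __init__(self):
        self.handlers: dict[str, list[Callable[[dict], None]]] = {}
    def on(self, event_type: str, handler: Callable[[dict], None]) -> None:
        self.handlers.setdefault(event_type, []).append(handler)
    def emit(self, event_type: str, payload: dict) -> None:
        for handler in self.handlers.get(event_type, []):
            handler(payload)

actions_taken: list[str] = []

def email_agent(email: dict) -> None:
    # Each step stands in for a tool call (CRM lookup, inventory check...).
    actions_taken.append(f"look up sender {email['from']}")
    actions_taken.append("review previous communications")
    actions_taken.append("check inventory")
    actions_taken.append("close ticket")

bus = EventBus()
bus.on("email.received", email_agent)   # bind the agent to its trigger
bus.emit("email.received", {"from": "customer@example.com"})
print(actions_taken)
```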

“Think of agents as the new apps for an AI-powered world. Every organization will have a constellation of agents — ranging from simple prompt-and-response to fully autonomous. They will work on behalf of an individual, team or function to execute and orchestrate business processes. Copilot is how you’ll interact with these agents, and they’ll do everything from accelerating lead generation and processing sales orders to automating your supply chain,” Microsoft wrote in its announcement.

The company shared some examples of customers that have already built their own autonomous agents, including Pets at Home, which created an agent to compile cases for human review; McKinsey & Company, which created an agent to speed up client onboarding; and Thomson Reuters, which created an agent to speed up its legal due diligence process.

Microsoft is also releasing ten new autonomous agents in Dynamics 365, which is its enterprise resource planning (ERP) and customer relationship management (CRM) software.

The new agents span sales, service, finance, and supply chain use cases, and include:

  • Sales Qualification Agent, which researches leads, prioritizes opportunities, and guides customer outreach
  • Supplier Communications Agent, which tracks supplier performance to detect and respond to delays
  • Customer Intent and Customer Knowledge Management Agents, which learn from customer service representatives how to resolve customer issues and add articles to a company’s knowledge base
