Getting Started with Zero-Shot Text Classification
In this article, you will learn how zero-shot text classification works and how to apply it using a pretrained transformer model. Topics we will cover include:

- The core idea behind zero-shot classification and how it reframes labeling as a reasoning task.
- How to use a pretrained model to classify text without task-specific training data.
- Practical techniques such as multi-label classification and hypothesis template tuning.

Let's get started.

Introduction

Zero-shot text classification is a way to label text without first training a classifier on your own task-specific dataset. Instead of collecting examples for every category you want, you provide the model with a piece of text and a list of possible labels. The model then decides which label fits best based on its general language understanding.

This makes zero-shot classification especially useful when you want to test an idea quickly, work with changing label sets, or build a lightweight prototype before investing in supervised training. Rather than learning a fixed mapping from text to label IDs, the model uses language itself to reason about what each label means. In this guide, we will understand the main idea behind zero-shot classification and run practical examples using facebook/bart-large-mnli.

How Zero-Shot Works

The core idea behind zero-shot classification is that the model does not treat labels as simple category names. Instead, it turns each label into a short natural-language statement and checks whether that statement is supported by the input text. This makes it especially useful in practical situations where you want to classify text quickly without collecting and labeling training data first, such as routing support tickets, tagging articles, detecting user intent, or organizing internal documents.
For example, suppose the input text is:

```python
text = "The company launched a new AI platform for enterprise customers."
```

And the candidate labels are:

```python
labels = ["technology", "sports", "finance"]
```

The model conceptually turns these into statements like:

- This text is about technology.
- This text is about sports.
- This text is about finance.

It then compares the original text against each of these statements and scores how well they match. The label with the strongest score is ranked highest.

The same idea can be applied to many real tasks. Instead of broad topic labels, a company might use labels such as billing issue, technical support, or refund request for customer service messages, or spam, harassment, and safe for moderation systems.

So the important shift is this: zero-shot classification is not really treated as a traditional classification problem. It is treated more like a reasoning problem about whether a label description fits the text. That is also why it works well for fast prototyping, low-resource tasks, and domains where labeled data does not yet exist.

This is also why wording matters. A label like billing issue often works better than a vague label like money, because the model has more semantic meaning to work with. In real use cases, clearer labels usually lead to better performance, whether you are classifying news topics, customer intents, moderation categories, or business workflows.

Seeing the Zero-Shot Model in Action

In this section, we will learn how to load a zero-shot classifier, run a basic example, test multi-label predictions, and improve results with a custom hypothesis template.

1. Load the Zero-Shot Classification Pipeline

First, install the required libraries:

```
pip install torch transformers
```

Now load the pipeline:

```python
from transformers import pipeline

classifier = pipeline(
    "zero-shot-classification",
    model="facebook/bart-large-mnli"
)
```

Here, the pipeline gives you an easy way to use a pretrained zero-shot model without writing lower-level inference code yourself. The model used here, facebook/bart-large-mnli, is commonly used for zero-shot classification because it is trained to determine whether one piece of text supports another.

2. Run a Simple Zero-Shot Example

Let's start with a basic example:

```python
text = "This tutorial explains how transformer models are used in NLP."
candidate_labels = ["technology", "health", "sports", "finance"]

result = classifier(text, candidate_labels)
print(f"Top prediction: {result['labels'][0]} ({result['scores'][0]:.2%})")
```

Output:

```
Top prediction: technology (96.52%)
```

This shows the model selecting the label that best matches the meaning of the text. Since the sentence discusses transformer models and natural language processing, technology is the strongest semantic fit among the candidate labels.

3. Classifying Text into Multiple Labels

Sometimes a text belongs to more than one category. In that case, you can enable multi_label=True:

```python
text = "The company launched a health app and announced strong business growth."
candidate_labels = ["technology", "healthcare", "business", "travel"]

result = classifier(
    text,
    candidate_labels,
    multi_label=True
)

threshold = 0.50
top_labels = [
    (label, score)
    for label, score in zip(result["labels"], result["scores"])
    if score >= threshold
]

print("Top labels:", ", ".join(f"{label} ({score:.2%})" for label, score in top_labels))
```

Output:

```
Top labels: healthcare (99.41%), technology (99.06%), business (98.15%)
```

This is useful when multiple labels can apply to the same input. In this example, the sentence is not only about technology but also about healthcare and business, so the model assigns strong scores to all three labels.

4. Customizing the Hypothesis Template

You can also change the hypothesis template. The pipeline uses a default phrasing internally, but a clearer or more natural template can often improve results.
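The pipeline's `hypothesis_template` argument controls that phrasing: each candidate label is substituted into the template to form the statement the model scores. As a quick illustration of the mechanics, here is the expansion done with plain string formatting, independent of the model itself:

```python
# How a hypothesis template expands into one statement per label.
hypothesis_template = "This article is about {}."
candidate_labels = ["technology", "health", "sports"]

hypotheses = [hypothesis_template.format(label) for label in candidate_labels]
print(hypotheses)
# The zero-shot pipeline scores the input text against each of these
# statements internally.
```

With the classifier loaded earlier, you would pass the template directly, e.g. `classifier(text, candidate_labels, hypothesis_template="This article is about {}.")`, and experiment with phrasings that match your domain.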
AI Agent Memory Explained in 3 Levels of Difficulty
In this article, you will learn how AI agent memory works across working memory, external memory, and scalable memory architectures for building agents that improve over time. Topics we will cover include:

- The memory problem in stateless large language model-based agents.
- How in-context, episodic, semantic, and procedural memory support agent behavior.
- How retrieval, memory writing, decay handling, and multi-agent consistency make memory work at scale.

Introduction

A stateless AI agent has no memory of previous calls. Every request starts from scratch. This works fine for isolated tasks, but it becomes a problem when an agent needs to track decisions, remember user preferences, or pick up where it left off.

The challenge is that memory in AI agents is a collection of different mechanisms that serve different purposes. These mechanisms also operate at different timescales: some are scoped to a single conversation, while others persist indefinitely. How you combine them determines whether your agent stays useful across sessions.

This article explains AI agent memory at three levels: what memory means for an agent and why it is hard, how the main memory types work in practice, and finally, the architectural patterns and retrieval strategies that make persistent, reliable memory work at scale.

Level 1: Understanding The Memory Problem In AI Agents

A large language model has no persistent state. Every call to the API is stateless: the model receives a block of text (the context window), processes it, returns a response, and retains nothing. There is no internal store being updated between calls. This is fine for answering a one-off question. It is a fundamental problem for anything agent-like: a system that takes multi-step actions, learns from feedback, or coordinates work across many sessions.

The following four questions make the memory problem concrete:

- What happened before? An agent that books calendar events needs to know what is already scheduled. If it does not remember, it double-books.
- What does this user want? A writing assistant that does not remember your preferred tone and style resets to generic behavior every session.
- What has the agent already tried? A research agent that does not remember failed search queries will repeat the same dead ends.
- What facts has the agent accumulated? An agent that discovers mid-task that a file is missing needs to record that and factor it into future steps.

The memory problem is the problem of giving an inherently stateless system the ability to behave as if it has persistent, queryable knowledge about the past.

Level 2: The Types Of Agent Memory

In-Context Memory Or Working Memory

This is the simplest form: everything in the context window right now. The conversation history, tool call results, system prompt, and relevant documents all get passed to the model as text on every call.

This is exact and immediate. The model can reason over anything in context with high fidelity. There is no retrieval step, no approximation, and no chance of pulling the wrong record. The constraint is context window size. Current models support 128K to 1M tokens, but costs and latency scale with length, so you cannot simply dump everything in and call it done. In practice, in-context memory works best for the active state of a task: the current conversation, recent tool outputs, and documents directly relevant to the immediate step.

External Memory

For information too large, too old, or too dynamic to keep in context at all times, agents query an external store and pull in what is relevant when needed. This is retrieval-augmented generation (RAG) applied to agent memory.
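Stepping back to in-context (working) memory for a moment, the pattern can be sketched as a rolling buffer that keeps recent turns within a token budget. This is a minimal illustration, not any framework's API: the token count is crudely approximated by whitespace splitting, and the budget is an arbitrary stand-in for a real context limit.

```python
class WorkingMemory:
    """Rolling buffer of conversation turns, trimmed to a token budget."""

    def __init__(self, max_tokens: int):
        self.max_tokens = max_tokens
        self.turns: list[str] = []

    def add(self, turn: str) -> None:
        self.turns.append(turn)
        # Drop the oldest turns until the buffer fits the budget again.
        while self._token_count() > self.max_tokens and len(self.turns) > 1:
            self.turns.pop(0)

    def _token_count(self) -> int:
        # Crude approximation: one token per whitespace-separated word.
        return sum(len(t.split()) for t in self.turns)

    def as_context(self) -> str:
        # Everything still in the buffer is all the model "remembers".
        return "\n".join(self.turns)


memory = WorkingMemory(max_tokens=12)
memory.add("User: book a flight to Tokyo")
memory.add("Agent: which dates?")
memory.add("User: May 3 to May 10 please")
# The oldest turn has been evicted to stay under the budget.
print(memory.as_context())
```

A production system would count real tokenizer tokens and usually summarize evicted turns into external memory rather than discarding them, but the core trade-off (exact recall for recent state, bounded by the window) is the same.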
Two retrieval patterns serve different needs:

- Semantic search over a vector database finds records similar in meaning to the current query.
- Exact lookup against a relational or key-value store retrieves structured facts by attribute: user preferences, task state, prior decisions, and entity records.

In practice, the most robust agent memory systems use both in combination: run a vector search and a structured query as needed, then merge the results.

Level 3 focuses on making memory systems work in real-world production. It goes beyond basic memory types and tackles practical challenges: how to structure memory more granularly, what information to store and when, how to reliably retrieve the right data at scale, and how to handle issues like stale data or multiple agents writing to the same system. In short, it is about the architecture and strategies that ensure memory actually improves an agent's performance.

Level 3: AI Agent Memory Architecture At Scale

What Needs To Be Stored

Not all information deserves the same treatment, and it is worth being precise about what you are actually storing. Agent memory naturally falls into a few categories:

- Episodic memory captures what happened: specific events, tool calls, and their outcomes.
- Semantic memory captures what is true: facts and preferences extracted from experience.
- Procedural memory captures how to do things. It encodes learned action patterns, successful strategies, and known failure modes.

Writing To Memory: When And What To Store

An agent that writes every token of every interaction to memory produces noise at scale. Memory has to be selective. The following are two common patterns:

- End-of-session summarization: After each session, the agent or a dedicated summarization step extracts salient facts, decisions, and outcomes and writes them as compact memory records.
- Event-triggered writes: Certain events explicitly trigger memory writes: user corrections, explicit preference statements, task completions, and error conditions.

What not to store: raw transcripts at scale, intermediate reasoning traces that do not affect future behavior, or redundant duplicates of existing records.

Retrieving From Memory: Getting The Right Context

Here is an overview of the three main retrieval strategies:

- Vector similarity search queries the memory store with an embedding of the current context and returns the top-K most semantically similar records. This is fast, approximate, and works well for unstructured memory. It requires an embedding model and a vector index such as HNSW or an IVF-based one. Quality depends heavily on how well the embedding model fits your domain.
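As a toy illustration of combining the two retrieval patterns, the sketch below merges a top-K cosine-similarity search over hand-made embedding vectors with an exact key-value lookup. The records, vectors, and store names are invented for the example; a real system would compute embeddings with a model and use a proper index (HNSW, IVF) instead of a linear scan.

```python
import math

# Toy episodic store: each record carries a hand-made embedding.
episodic = [
    ("booked flight to Tokyo", [0.9, 0.1, 0.0]),
    ("user reported billing bug", [0.1, 0.9, 0.2]),
    ("summarized Q3 report", [0.2, 0.2, 0.9]),
]

# Toy structured store for exact lookups.
semantic = {"preferred_tone": "casual", "home_airport": "SFO"}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, keys, k=2):
    # Vector search: top-K most similar episodic records (linear scan here).
    ranked = sorted(episodic, key=lambda r: cosine(query_vec, r[1]), reverse=True)
    similar = [text for text, _ in ranked[:k]]
    # Exact lookup: structured facts by attribute.
    facts = {key: semantic[key] for key in keys if key in semantic}
    return similar, facts

similar, facts = retrieve([0.8, 0.2, 0.1], keys=["home_airport"])
print(similar)  # most similar episodic records first
print(facts)
```

Merging the two result sets before building the prompt gives the model both fuzzy recall of past events and reliable access to structured facts.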
Structured Outputs vs. Function Calling: Which Should Your Agent Use?
In this article, you will learn the architectural differences between structured outputs and function calling in modern language model systems. Topics we will cover include:

- How structured outputs and function calling work under the hood.
- When to use each approach in real-world machine learning systems.
- The performance, cost, and reliability trade-offs between the two.

Introduction

Language models (LMs), at their core, are text-in, text-out systems. For a human conversing with one via a chat interface, this is perfectly fine. But for machine learning practitioners building autonomous agents and reliable software pipelines, raw unstructured text is a nightmare to parse, route, and integrate into deterministic systems. To build reliable agents, we need predictable, machine-readable outputs and the ability to interact seamlessly with external environments.

To bridge this gap, modern LM API providers (like OpenAI, Anthropic, and Google Gemini) have introduced two primary mechanisms:

- Structured Outputs: Forcing the model to reply by adhering exactly to a predefined schema (most commonly a JSON schema or a Python Pydantic model)
- Function Calling (Tool Use): Equipping the model with a library of function definitions that it can choose to invoke dynamically based on the context of the prompt

At first glance, these two capabilities look very similar. Both typically rely on passing JSON schemas to the API under the hood, and both result in the model outputting structured key-value pairs instead of conversational prose. However, they serve fundamentally different architectural purposes in agent design. Conflating the two is a common pitfall: choosing the wrong mechanism for a feature can lead to brittle architectures, excessive latency, and unnecessarily inflated API costs.
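To make the distinction concrete, here is a rough sketch of what the two request payloads look like, using OpenAI-style field names (`response_format` with a JSON schema versus a `tools` list). Treat the exact field names as an assumption: they vary by provider and API version, and the model name is a placeholder.

```python
# Structured outputs: constrain the SHAPE of the single reply.
structured_request = {
    "model": "some-model",
    "messages": [{"role": "user", "content": "Extract the invoice fields."}],
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "invoice",
            "schema": {
                "type": "object",
                "properties": {
                    "customer": {"type": "string"},
                    "total": {"type": "number"},
                },
                "required": ["customer", "total"],
            },
        },
    },
}

# Function calling: describe ACTIONS the model may choose to invoke.
tool_request = {
    "model": "some-model",
    "messages": [{"role": "user", "content": "Book my usual flight to New York."}],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "book_flight",
                "description": "Book a flight to a destination airport.",
                "parameters": {
                    "type": "object",
                    "properties": {"destination": {"type": "string"}},
                    "required": ["destination"],
                },
            },
        }
    ],
}
```

The first request always yields one schema-conforming reply; the second may yield either plain text or a tool-call message that your application must handle.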
Let's unpack the architectural distinctions between these methods and provide a decision-making framework for when to use each.

Unpacking the Mechanics: How They Work Under the Hood

To understand when to use these features, it is necessary to understand how they differ at the mechanical and API levels.

Structured Outputs Mechanics

Historically, getting a model to output raw JSON relied on prompt engineering ("You are a helpful assistant that *only* speaks in JSON…"). This was error-prone, requiring extensive retry logic and validation. Modern structured outputs fundamentally change this through grammar-constrained decoding. Libraries like Outlines, or native features like OpenAI's Structured Outputs, mathematically restrict the token probabilities at generation time. If the chosen schema dictates that the next token must be a quotation mark or a specific boolean value, the probabilities of all non-compliant tokens are masked out (set to zero).

This is a single-turn generation strictly focused on form. The model is answering the prompt directly, but its vocabulary is confined to the exact structure you defined, with the aim of ensuring near-100% schema compliance.

Function Calling Mechanics

Function calling, on the other hand, relies heavily on instruction tuning. During training, the model is fine-tuned to recognize situations where it lacks the necessary information to complete a prompt, or where the prompt explicitly asks it to take an action. When you provide a model with a list of tools, you are telling it: "If you need to, you can pause your text generation, select a tool from this list, and generate the necessary arguments to run it."

This is an inherently multi-turn, interactive flow:

1. The model decides to call a tool and outputs the tool name and arguments.
2. The model pauses. It cannot execute the code itself.
3. Your application code executes the selected function locally using the generated arguments.
4. Your application returns the result of the function back to the model.
5. The model synthesizes this new information and continues generating its final response.

When to Choose Structured Outputs

Structured outputs should be your default approach whenever the goal is pure data transformation, extraction, or standardization.

Primary Use Case: The model has all the necessary information within the prompt and context window; it just needs to reshape it.

Examples for Practitioners:

- Data Extraction (ETL): Processing raw, unstructured text like a customer support transcript and extracting entities (names, dates, complaint types, and sentiment scores) into a strict database schema.
- Query Generation: Converting a messy natural language user prompt into a strict, validated SQL query or a GraphQL payload. If the schema is broken, the query fails, making 100% adherence critical.
- Internal Agent Reasoning: Structuring an agent's "thoughts" before it acts. You can enforce a Pydantic model that requires a thought_process field, an assumptions field, and finally a decision field. This forces a chain-of-thought process that is easily parsed by your backend logging systems.

The Verdict: Use structured outputs when the "action" is simply formatting. Because there is no mid-generation interaction with external systems, this approach ensures high reliability, lower latency, and zero schema-parsing errors.

When to Choose Function Calling

Function calling is the engine of agentic autonomy. If structured outputs dictate the shape of the data, function calling dictates the control flow of the application.

Primary Use Case: External interactions, dynamic decision-making, and cases where the model needs to fetch information it does not currently possess.

Examples for Practitioners:

- Executing Real-World Actions: Triggering external APIs based on conversational intent. If a user says, "Book my usual flight to New York," the model uses function calling to trigger the book_flight(destination="JFK") tool.
- Retrieval-Augmented Generation (RAG): Instead of a naive RAG pipeline that always searches a vector database, an agent can use a search_knowledge_base tool. The model dynamically decides what search terms to use based on the context, or decides not to search at all if it already knows the answer.
- Dynamic Task Routing: For complex systems, a router model might use function calling to select the best specialized sub-agent (e.g., calling delegate_to_billing_agent versus delegate_to_tech_support) to handle a specific query.

The Verdict: Choose function calling when the model must interact with the outside world, fetch hidden data, or conditionally execute software logic mid-thought.

Performance, Latency, and Cost Implications

When deploying agents to production, the architectural choice between these two methods directly impacts your unit economics and user experience.

Token Consumption: Function calling tends to consume more tokens, since tool definitions are sent with every request and each tool round trip adds further input and output tokens.
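The multi-turn flow described under "Function Calling Mechanics" can be sketched with a stubbed model, so the control flow is visible without any API calls. The model stub and tool names here are invented purely for illustration.

```python
# Local tool implementations the "model" may request.
def book_flight(destination: str) -> str:
    return f"Flight booked to {destination}."

TOOLS = {"book_flight": book_flight}

def fake_model(messages):
    """Stub model: first turn requests a tool, second turn summarizes."""
    last = messages[-1]
    if last["role"] == "user":
        # Steps 1-2: the model emits a tool call and pauses.
        return {"tool_call": {"name": "book_flight",
                              "arguments": {"destination": "JFK"}}}
    # Step 5: the model folds the tool result into a final answer.
    return {"content": f"Done: {last['content']}"}

def run_agent(user_prompt):
    messages = [{"role": "user", "content": user_prompt}]
    reply = fake_model(messages)
    while "tool_call" in reply:
        call = reply["tool_call"]
        # Step 3: the application, not the model, executes the function.
        result = TOOLS[call["name"]](**call["arguments"])
        # Step 4: feed the result back to the model.
        messages.append({"role": "tool", "content": result})
        reply = fake_model(messages)
    return reply["content"]

print(run_agent("Book my usual flight to New York"))
# -> Done: Flight booked to JFK.
```

Swapping `fake_model` for a real API client turns this loop into the standard agent runtime; the while-loop structure is what distinguishes function calling from single-turn structured outputs.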
How to Implement Tool Calling with Gemma 4 and Python
In this article, you will learn how to build a local, privacy-first tool-calling agent using the Gemma 4 model family and Ollama. Topics we will cover include:

- An overview of the Gemma 4 model family and its capabilities.
- How tool calling enables language models to interact with external functions.
- How to implement a local tool calling system using Python and Ollama.

Introducing the Gemma 4 Family

The open-weights model ecosystem shifted recently with the release of the Gemma 4 model family. Built by Google, the Gemma 4 variants were created with the intention of providing frontier-level capabilities under a permissive Apache 2.0 license, giving machine learning practitioners complete control over their infrastructure and data privacy.

The Gemma 4 release features models ranging from the parameter-dense 31B and structurally complex 26B Mixture of Experts (MoE) to lightweight, edge-focused variants. More importantly for AI engineers, the model family features native support for agentic workflows. The models have been fine-tuned to reliably generate structured JSON outputs and natively invoke function calls based on system instructions. This transforms them from "fingers crossed" reasoning engines into practical systems capable of executing workflows and conversing with external APIs locally.

Tool Calling in Language Models

Language models began life as closed-loop conversationalists. If you asked a language model for a real-world sensor reading or live market rates, it could at best apologize, and at worst, hallucinate an answer. Tool calling, also known as function calling, is the foundational architectural shift required to close this gap. Tool calling serves as the bridge that transforms static models into dynamic autonomous agents. When tool calling is enabled, the model evaluates a user prompt against a provided registry of available programmatic tools (supplied via JSON schema).
Rather than attempting to guess the answer using only its internal weights, the model pauses inference, formats a structured request specifically designed to trigger an external function, and awaits the result. Once the result is processed by the host application and handed back to the model, the model synthesizes the injected live context to formulate a grounded final response.

The Setup: Ollama and Gemma 4:E2B

To build a genuinely local, privacy-first tool calling system, we will use Ollama as our local inference runner, paired with the gemma4:e2b (Edge 2 billion parameter) model. The gemma4:e2b model is built specifically for mobile devices and IoT applications. It represents a paradigm shift in what is possible on consumer hardware, activating an effective 2 billion parameter footprint during inference. This optimization preserves system memory while achieving near-zero-latency execution. By executing entirely offline, it removes rate limits and API costs while preserving strict data privacy.

Despite this incredibly small size, Google has engineered gemma4:e2b to inherit the multimodal properties and native function-calling capabilities of the larger 31B model, making it an ideal foundation for a fast, responsive desktop agent. It also allows us to test the capabilities of the new model family without requiring a GPU.

The Code: Setting Up the Agent

To orchestrate the language model and the tool interfaces, we will follow a zero-dependency philosophy for our implementation, leveraging only standard Python libraries like urllib and json, ensuring maximum portability and transparency while avoiding bloat. The complete code for this tutorial can be found at this GitHub repository.
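Before defining any tools, it helps to see the shape of the request we will send to Ollama's local chat endpoint (by default served at http://localhost:11434/api/chat). The helper below only builds the JSON body; actually sending it requires a running Ollama server, so this sketch deliberately stops short of the network call. The `tools` and `stream` fields follow Ollama's chat API convention, but verify them against the version you are running.

```python
import json

def build_chat_payload(model: str, user_prompt: str, tools: list) -> bytes:
    """Build the JSON body for a POST to Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_prompt}],
        "tools": tools,   # JSON-schema tool registry (filled in later)
        "stream": False,  # one complete response is simpler to parse
    }
    return json.dumps(payload).encode("utf-8")

body = build_chat_payload(
    model="gemma4:e2b",
    user_prompt="What is the weather in Paris?",
    tools=[],
)
print(json.loads(body)["model"])
```

With a server running, this body would be sent via `urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})`, keeping with the article's zero-dependency approach.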
The architectural flow of our application operates in the following way:

1. Define local Python functions that act as our tools
2. Define a strict JSON schema that explains to the language model exactly what these tools do and what parameters they expect
3. Pass the user's query and the tool registry to the local Ollama API
4. Catch the model's response, identify if it requested a tool call, execute the corresponding local code, and feed the answer back

Building the Tools: get_current_weather

Let's dive into the code, keeping in mind that our agent's capability rests on the quality of its underlying functions. Our first function is get_current_weather, which reaches out to the open-source Open-Meteo API to resolve real-time weather data for a specific location.

```python
import json
import urllib.parse
import urllib.request


def get_current_weather(city: str, unit: str = "celsius") -> str:
    """Gets the current temperature for a given city using the Open-Meteo API."""
    try:
        # Geocode the city to get latitude and longitude
        geo_url = f"https://geocoding-api.open-meteo.com/v1/search?name={urllib.parse.quote(city)}&count=1"
        geo_req = urllib.request.Request(geo_url, headers={"User-Agent": "Gemma4ToolCalling/1.0"})
        with urllib.request.urlopen(geo_req) as response:
            geo_data = json.loads(response.read().decode("utf-8"))

        if "results" not in geo_data or not geo_data["results"]:
            return f"Could not find coordinates for city: {city}."

        location = geo_data["results"][0]
        lat = location["latitude"]
        lon = location["longitude"]
        country = location.get("country", "")

        # Fetch the weather
        temp_unit = "fahrenheit" if unit.lower() == "fahrenheit" else "celsius"
        weather_url = (
            f"https://api.open-meteo.com/v1/forecast?latitude={lat}&longitude={lon}"
            f"&current=temperature_2m,wind_speed_10m&temperature_unit={temp_unit}"
        )
        weather_req = urllib.request.Request(weather_url, headers={"User-Agent": "Gemma4ToolCalling/1.0"})
        with urllib.request.urlopen(weather_req) as response:
            weather_data = json.loads(response.read().decode("utf-8"))

        if "current" in weather_data:
            current = weather_data["current"]
            temp = current["temperature_2m"]
            wind = current["wind_speed_10m"]
            temp_unit_str = weather_data["current_units"]["temperature_2m"]
            wind_unit_str = weather_data["current_units"]["wind_speed_10m"]
            return (
                f"The current weather in {city.title()} ({country}) is {temp}{temp_unit_str} "
                f"with wind speeds of {wind}{wind_unit_str}."
            )
        else:
            return f"Weather data for {city} is unavailable from the API."
    except Exception as e:
        return f"Error fetching weather for {city}: {e}"
```
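Step 2 of the architectural flow is the JSON schema that advertises this function to the model, plus the dispatch step that executes whatever the model requests. The sketch below uses the common OpenAI-style `"type": "function"` schema wrapper that Ollama accepts, and stands in a stub for the full weather function, so check the field names against your runtime before relying on them.

```python
# JSON-schema description of get_current_weather for the tool registry.
WEATHER_TOOL_SCHEMA = {
    "type": "function",
    "function": {
        "name": "get_current_weather",
        "description": "Gets the current temperature and wind speed for a city.",
        "parameters": {
            "type": "object",
            "properties": {
                "city": {"type": "string", "description": "Name of the city."},
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["city"],
        },
    },
}

def get_current_weather(city: str, unit: str = "celsius") -> str:
    # Stand-in for the full Open-Meteo implementation shown earlier.
    return f"(weather for {city} in {unit})"

# Registry mapping tool names to the local Python implementations.
TOOL_REGISTRY = {"get_current_weather": get_current_weather}

def dispatch_tool_call(tool_call: dict) -> str:
    """Execute the tool the model requested and return its result."""
    name = tool_call["function"]["name"]
    args = tool_call["function"]["arguments"]
    if name not in TOOL_REGISTRY:
        return f"Unknown tool: {name}"
    return TOOL_REGISTRY[name](**args)
```

When the model's response contains a tool call, the host application looks it up in `TOOL_REGISTRY`, runs it, and appends the returned string to the conversation as the tool result, completing step 4 of the flow.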