Applying ChatGPT + Claude to Business Analyst Work

ChatGPT and Claude are two large language models (LLMs) developed by OpenAI and Anthropic, respectively. They can understand and generate natural-language text, process complex information, and support a wide range of language and analysis tasks.

Benefits of using ChatGPT and Claude in BA work:

  1. Speed up requirements gathering: Use AI to generate interview questions, analyze use cases, and synthesize information from stakeholders.
  2. Improve the quality of technical documentation: Leverage AI's writing and editing abilities to produce documents such as software requirements specifications, system design documents, and API documentation.
  3. Support analysis and design: Use AI to assist with system analysis, data modeling, and user interface design.
  4. Optimize the software development process: Apply AI to analyze and suggest improvements to development, testing, and deployment processes.
  5. Support technical decision-making: Use AI as a supporting tool when evaluating technical solutions and choosing technologies.

To get the most out of ChatGPT and Claude in a Business Analyst (BA) role in software engineering, you can use the following prompt structure:

You are a [level] Business Analyst with [number of years] of experience in software development, specializing in [specific domain/industry]. Your task is to [specific task description]. [Example of the task or context (if needed)]. Please provide your response [response rules (if needed)].

Explanation of the components (the sketch after this list puts them together in code):

  1. You are: Start the prompt by assigning the AI a role.
  2. [level]: Specify the experience level, e.g. “Senior”, “Lead”, “Principal”.
  3. Business Analyst: Specify the exact role, here Business Analyst.
  4. [number of years]: Specify the number of years of experience, e.g. “5 years”, “10 years”.
  5. specializing in [specific domain/industry]: Specify the domain or industry, e.g. “e-commerce platforms”, “financial services software”, “healthcare IT systems”.
  6. Your task is to [specific task description]: Describe the task or problem to be solved in detail.
  7. [Example of the task or context] (optional): Provide a concrete example or background context if needed.
  8. Please provide your response [response rules] (optional): Set rules or constraints for the answer, e.g. “in bullet points”, “with a focus on user stories”, “including a simple process diagram”.
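
To make the template concrete, here is a minimal Python sketch of a helper that assembles a prompt from these components; the function name and defaults are illustrative, not part of any SDK:

    def build_ba_prompt(level, years, domain, task, context=None, response_rule=None):
        """Assemble a BA prompt from the components explained above."""
        prompt = (
            f"You are a {level} Business Analyst with {years} of experience in "
            f"software development, specializing in {domain}. "
            f"Your task is to {task}."
        )
        if context:  # optional example or background (component 7)
            prompt += f" {context}"
        if response_rule:  # optional output constraints (component 8)
            prompt += f" Please provide your response {response_rule}."
        return prompt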

Usage examples (the sketch after these examples shows how to send the first one to both models):

  1. Requirements analysis:
    You are a Senior Business Analyst with 8 years of experience in software development, specializing in e-commerce platforms. Your task is to analyze and refine the requirements for a new product recommendation feature. The feature should suggest products based on user browsing history and purchase patterns. Please provide your response in the form of user stories with acceptance criteria.
  2. Writing a specification document:
    You are a Lead Business Analyst with 10 years of experience in software development, specializing in financial services software. Your task is to create a high-level software requirements specification (SRS) for a new mobile banking app. The app should include features such as account management, fund transfers, and bill payments. Please structure your response as an outline with main sections and key points for each section.
  3. Process analysis:
    You are a Principal Business Analyst with 15 years of experience in software development, specializing in healthcare IT systems. Your task is to analyze the current patient admission process in a hospital and propose improvements that can be implemented through software solutions. Please provide your response with a brief description of the current process, identified pain points, and suggested improvements, including a simple process flow diagram for the proposed solution.
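
As a sketch, the first example can be sent to both models through their official Python SDKs (the openai and anthropic packages). The model names reflect versions available at the time of writing, API keys are assumed to be set in the environment, and build_ba_prompt is the helper sketched earlier:

    from openai import OpenAI
    import anthropic

    prompt = build_ba_prompt(
        level="Senior",
        years="8 years",
        domain="e-commerce platforms",
        task=("analyze and refine the requirements for a new product "
              "recommendation feature based on user browsing history "
              "and purchase patterns"),
        response_rule="in the form of user stories with acceptance criteria",
    )

    # ChatGPT: the OpenAI SDK reads OPENAI_API_KEY from the environment.
    openai_client = OpenAI()
    chat = openai_client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": prompt}],
    )
    print(chat.choices[0].message.content)

    # Claude: the Anthropic SDK reads ANTHROPIC_API_KEY from the environment.
    claude_client = anthropic.Anthropic()
    msg = claude_client.messages.create(
        model="claude-3-5-sonnet-20240620",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    print(msg.content[0].text)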

What is OpenAI o1? And how does it compare to GPT-4o?

OpenAI announced the release of its new series of AI models, OpenAI o1, with significantly advanced reasoning capabilities. According to OpenAI, what sets o1 apart from the GPT-4o family is that the models are designed to spend more time thinking before they respond. One of the caveats of older and current OpenAI models (e.g. GPT-4o and GPT-4o mini) is their limited reasoning and contextual awareness, which lag behind advanced models like Anthropic’s Claude 3.5 Sonnet. OpenAI o1 is designed to help users complete complex tasks and solve harder problems than previous models in science, coding, and math.

This blog explores OpenAI o1’s features, test results, and pricing, and compares it with GPT-4o and Claude 3.5 Sonnet on existing benchmarks.

1. Overview of OpenAI o1

OpenAI o1 is a model family designed specifically for advanced reasoning and problem-solving. According to OpenAI, the models can perform similarly to PhD students on challenging benchmark tasks in physics, chemistry, and biology, and the test results back this up. Key highlights of the OpenAI o1 models include:

1.1 Performance Metrics

OpenAI o1 ranks in the 89th percentile on competitive programming questions and has shown remarkable results on standardized tests, exceeding PhD-level human accuracy on physics, biology, and chemistry benchmarks. The models also have a 128K context window and an October 2023 knowledge cutoff.

1.2 o1 Model Family

The series includes the o1-preview model, which has broader world knowledge and reasoning, and a smaller variant, o1-mini, which is faster and more cost-effective, especially for coding tasks. o1-mini is approximately 80% cheaper than o1-preview while maintaining competitive performance in coding evaluations.

1.3 Availability of o1 models

The o1-preview models are currently available in ChatGPT Plus (including access for Team and Enterprise users), as well as via the API for developers on tier 5 of API usage. In ChatGPT, there is a strict message limit of only 30 messages per week for o1-preview and 50 messages per week for o1-mini, after which you are required to switch to the GPT-4o models.

1.4 Pricing for o1 models compared with GPT-4o

OpenAI has structured its pricing to cater to different user needs, with o1-mini being the most economical of the new options. Here’s a breakdown of the pricing for the OpenAI o1 models, alongside GPT-4o and Claude 3.5 Sonnet for comparison:

Model                  Input Tokens      Output Tokens
OpenAI o1              $15.00 / 1M       $60.00 / 1M
OpenAI o1-mini         $3.00 / 1M        $12.00 / 1M
GPT-4o (2024-08-06)    $2.50 / 1M        $10.00 / 1M
GPT-4o mini            $0.15 / 1M        $0.60 / 1M
Claude 3.5 Sonnet      $3.00 / 1M        $15.00 / 1M
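
As a quick sanity check on these numbers, here is a small Python sketch that estimates per-request cost from the table above (prices are USD per million tokens; note that o1's hidden reasoning tokens are billed as output tokens, so real o1 costs run higher than a naive estimate):

    # Per-1M-token prices (input, output) from the table above, in USD.
    PRICES = {
        "o1": (15.00, 60.00),
        "o1-mini": (3.00, 12.00),
        "gpt-4o": (2.50, 10.00),
        "gpt-4o-mini": (0.15, 0.60),
        "claude-3.5-sonnet": (3.00, 15.00),
    }

    def request_cost(model, input_tokens, output_tokens):
        input_price, output_price = PRICES[model]
        return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

    # Example: a 2,000-token prompt with an 800-token answer.
    print(f"${request_cost('o1', 2_000, 800):.4f}")       # $0.0780
    print(f"${request_cost('o1-mini', 2_000, 800):.4f}")  # $0.0156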

2. Comparison of OpenAI o1 vs GPT-4o

In rigorous testing, OpenAI o1 has demonstrated superior reasoning skills compared to its predecessors. For example, in a qualifying exam for the International Mathematics Olympiad, the o1 model scored 83%, while GPT-4o only managed 13%. Additionally, the o1 model scored significantly higher on jailbreaking tests, indicating a stronger adherence to safety protocols.

Performance Comparison

OpenAI’s published charts provide some interesting details about OpenAI o1’s technical performance across different metrics.

3. Comparison of OpenAI o1 vs Claude 3.5 Sonnet

Here are some quick points highlighting the differences between OpenAI o1, GPT-4o, and Claude 3.5 Sonnet:

  • Reasoning Ability: OpenAI o1 outperforms GPT-4o in complex reasoning tasks, as evidenced by its superior scores in competitive programming and math challenges. However, its context window is still smaller than that of Claude’s most premium plan, Claude for Enterprise, which offers a 500K context window.
  • Safety and Compliance: OpenAI o1 has shown improved performance in safety tests, indicating better adherence to safety protocols compared to GPT-4o and Claude 3.5 Sonnet.

Anthropic has also launched a GitHub integration for Claude that grounds responses in your own repositories, which is especially helpful for code-generation use cases.

4. Conclusion

The introduction of OpenAI o1 marks a significant milestone in AI development, particularly in enhancing reasoning capabilities for complex problem-solving. OpenAI has said it expects to add browsing, file and image uploading, and other features to make the models more useful to everyone, and it will be interesting to follow these developments. At the same time, it is important to compare models and pick the one that works best for your use case, because the most expensive model isn’t always the best. The leading models currently include GPT-4o, Claude 3.5 Sonnet, and Llama 3.1, and you can test several of them and make a decision that works for you.

Meta Launches Llama 3.2, Optimized for Mobile and Edge Devices


Llama 3.2 is a new large language model (LLM) from Meta, designed to be smaller and more lightweight compared to Llama 3.1. It includes a range of models in various sizes, such as small and medium-sized vision models (11B and 90B) and lightweight text models (1B and 3B). The 1B and 3B models are specifically designed for use on edge devices and mobile platforms.

Llama 3.1, which launched last July, is an open-weight model family whose largest variant has an extremely large parameter count of 405B, making it difficult to deploy at scale for widespread use. This challenge led to the development of Llama 3.2.

Llama 3.2’s 1B and 3B models are lightweight AI models designed specifically for mobile devices, which is why they support only text-based tasks; the larger models, by contrast, are meant to handle more complex processing on cloud servers. Thanks to their smaller parameter counts, the 1B and 3B models can run directly on-device and handle contexts of up to 128K tokens (approximately 96,240 words) for tasks like text summarization, sentence rewriting, and more. Because the processing occurs on-device, data security is also enhanced, as user data never leaves the device.

Run Meta Llama 3.2 1B & 3B Models Locally on iOS Devices

Meta’s latest Llama 3.2 models are a leap forward in AI technology, especially for mobile and on-device applications. The 1B and 3B models, specifically, are designed to run smoothly on hardware like smartphones, including SoCs (systems on a chip) from Qualcomm, MediaTek, and other ARM-based processors. This opens up new possibilities for bringing advanced AI capabilities directly to your pocket, without needing a powerful server.
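
As a rough illustration of this kind of local inference (run on a laptop rather than an actual iOS build, where you would instead embed llama.cpp or a framework such as Apple’s MLX), here is a minimal sketch using the llama-cpp-python bindings; the GGUF file path is a placeholder for a locally downloaded quantization of the 1B instruct model:

    from llama_cpp import Llama

    # Placeholder path to a local GGUF quantization of Llama 3.2 1B Instruct.
    llm = Llama(model_path="./llama-3.2-1b-instruct-q4_k_m.gguf", n_ctx=8192)

    out = llm.create_chat_completion(
        messages=[{
            "role": "user",
            "content": "Summarize in one sentence: on-device LLMs keep user data local.",
        }],
        max_tokens=128,
    )
    print(out["choices"][0]["message"]["content"])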

Meta revealed that the Llama 3.2 1B and 3B models are actually optimized versions of the larger Llama 3.1 models (8B and 70B). These smaller models are created using a process called “knowledge distillation,” where larger models “teach” the smaller ones. The output of the large models is used as a target during the training of the smaller models. This process adjusts the smaller models’ weights in such a way that they maintain much of the performance of the original larger model. In simple terms, this approach helps the smaller models achieve a higher level of efficiency compared to training them from scratch.
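
As a minimal sketch of the idea (standard knowledge distillation in PyTorch, not Meta’s published recipe; the temperature and loss weighting are illustrative assumptions):

    import torch.nn.functional as F

    def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
        # Soft targets: the student mimics the teacher's softened output distribution.
        soft = F.kl_div(
            F.log_softmax(student_logits / T, dim=-1),
            F.softmax(teacher_logits / T, dim=-1),
            reduction="batchmean",
        ) * (T * T)  # rescale so the soft term's gradients match the hard term
        # Hard targets: ordinary cross-entropy against the ground-truth tokens.
        hard = F.cross_entropy(student_logits, labels)
        return alpha * soft + (1 - alpha) * hard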

For more complex tasks, Meta has also introduced the larger Llama 3.2 vision models, sized at 11B and 90B. These models not only handle text but also have impressive image-processing capabilities; for example, they can be applied to tasks like understanding charts and graphs. Businesses can use these models to get deeper insights from sales data, analyze financial reports, or even automate complex visual tasks that go beyond text analysis.

With Llama 3.2, Meta is pushing the boundaries of AI, from mobile-optimized, secure, on-device processing to more advanced cloud-based visual intelligence.

In its earlier versions, Llama was primarily focused on processing language (text) data. However, with Llama 3.2, Meta has expanded its capabilities to handle images as well. This transformation required significant architectural changes and the addition of new components to the model. Here’s how Meta made it possible:

1. Introducing an Image Encoder: To enable Llama to process images, Meta added an image encoder to the model. This encoder translates visual data into a form that the language model can understand, effectively bridging the gap between images and text processing.

2. Adding an Adapter: To seamlessly integrate the image encoder with the existing language model, Meta introduced an adapter. This adapter connects the image encoder to the language model using cross-attention layers, which allow the model to combine information from both images and text. Cross-attention helps the model focus on relevant parts of the image while processing related textual information (a toy sketch of this wiring follows the list below).

3. Training the Adapter: The adapter was trained on paired datasets consisting of images and corresponding text, allowing it to learn how to accurately link visual information to its textual context. This step is crucial for tasks like image captioning, where the model needs to interpret an image and generate a relevant description.

4. Additional Training for Better Visual Understanding: Meta took the model’s training further by feeding it various datasets, including both noisy and high-quality data. This additional training phase ensures that the model becomes proficient at understanding and reasoning about visual content, even in less-than-ideal conditions.

5. Post-Training Optimization: After the training phase, Llama 3.2 underwent optimization using several advanced techniques. One of these involved leveraging synthetic data and a reward model to fine-tune the model’s performance. These strategies help improve the overall quality of the model, allowing it to generate better outputs, especially when dealing with visual information.
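
To make step 2 concrete, here is a toy PyTorch sketch of a cross-attention adapter in the spirit described above; the dimensions, projection, and residual wiring are illustrative assumptions, not Llama 3.2’s actual configuration:

    import torch
    import torch.nn as nn

    class CrossAttentionAdapter(nn.Module):
        def __init__(self, d_text=4096, d_image=1024, n_heads=8):
            super().__init__()
            # Project image-encoder features into the text model's hidden size.
            self.img_proj = nn.Linear(d_image, d_text)
            self.attn = nn.MultiheadAttention(d_text, n_heads, batch_first=True)
            self.norm = nn.LayerNorm(d_text)

        def forward(self, text_states, image_features):
            # Queries come from the text tokens; keys/values come from projected
            # image patches, so each text token attends to relevant image regions.
            img = self.img_proj(image_features)
            attended, _ = self.attn(query=text_states, key=img, value=img)
            # Residual connection keeps the original language pathway intact.
            return self.norm(text_states + attended)

    # Toy usage: a batch of 16 text tokens attending over 256 image patches.
    text = torch.randn(1, 16, 4096)
    patches = torch.randn(1, 256, 1024)
    out = CrossAttentionAdapter()(text, patches)  # shape: (1, 16, 4096)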

With these changes, Meta has evolved Llama from a purely text-based model into a powerful multimodal AI capable of processing both text and images, broadening its potential applications across industries.

When it comes to Llama 3.2’s smaller models, both the 1B and 3B versions show promising results. The Llama 3.2 3B model, in particular, demonstrates impressive performance across a range of tasks, especially on more complex benchmarks such as MMLU, IFEval, GSM8K, and HellaSwag, where it competes favorably against Google’s Gemma 2B IT model.

Even the smaller Llama 3.2 1B model holds its own, showing respectable scores despite its size, which makes it a great option for devices with limited resources. This performance highlights the efficiency of the model, especially for mobile or edge applications where resources are constrained.

Overall, the Llama 3.2 3B model stands out as a small but highly capable language model, with the potential to perform well across a variety of language processing tasks. It’s a testament to how even compact models can achieve excellent results when optimized effectively.