"You think we can develop an application similar to ChatGPT?" Last month, an entrepreneurial friend came to me and wanted to be an AI assistant in a vertical field. As a full-stack developer who often deals with AI APIs, this idea immediately aroused my interest. But to be honest, building an AI application from scratch still made me a little nervous.
After a month of development iteration, we successfully launched the first version, and the user feedback was surprisingly good. Today I will share the technical selection, architectural design and practical experience in this process.
Technology selection
The first decision we faced was the technology stack. Taking real-time requirements, performance, and development efficiency into account, this is what we settled on:
```typescript
// Project technology stack
const techStack = {
  frontend: {
    framework: 'Next.js 14',          // App Router + React Server Components
    ui: 'Tailwind CSS + Shadcn UI',
    state: 'Zustand',
    realtime: 'Server-Sent Events'
  },
  backend: {
    runtime: 'Node.js',
    framework: 'Next.js API Routes',
    database: 'PostgreSQL + Prisma',
    cache: 'Redis'
  },
  ai: {
    provider: 'OpenAI API',
    framework: 'Langchain',
    vectorStore: 'PineconeDB'
  }
}
```
Core function implementation
1. Implementation of streaming response
The most important piece is the streaming response that produces the typewriter effect:
```tsx
// app/api/chat/route.ts
import { OpenAIStream } from '@/lib/openai'
import { StreamingTextResponse } from 'ai'

export async function POST(req: Request) {
  const { messages } = await req.json()

  // Call the OpenAI API and get a streaming response
  const stream = await OpenAIStream({
    model: 'gpt-4',
    messages,
    temperature: 0.7,
    stream: true
  })

  // Return the streaming response
  return new StreamingTextResponse(stream)
}

// components/Chat.tsx
function Chat() {
  const [messages, setMessages] = useState<Message[]>([])
  const [isLoading, setIsLoading] = useState(false)

  const handleSubmit = async (content: string) => {
    setIsLoading(true)

    // Add the user message, plus an empty assistant message to stream into
    const nextMessages = [...messages, { role: 'user', content }]
    setMessages([...nextMessages, { role: 'assistant', content: '' }])

    try {
      const response = await fetch('/api/chat', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ messages: nextMessages })
      })
      if (!response.ok) throw new Error('Request failed')

      // Handle the streaming response
      const reader = response.body!.getReader()
      const decoder = new TextDecoder()
      let aiResponse = ''

      while (true) {
        const { done, value } = await reader.read()
        if (done) break

        // Decode and append the new content
        aiResponse += decoder.decode(value)

        // Update the last (assistant) message in the UI
        setMessages(prev => [
          ...prev.slice(0, -1),
          { role: 'assistant', content: aiResponse }
        ])
      }
    } catch (error) {
      console.error('There was an error in chat:', error)
    } finally {
      setIsLoading(false)
    }
  }

  return (
    <div className='flex flex-col h-screen'>
      <div className='flex-1 overflow-auto p-4'>
        {messages.map((message, index) => (
          <Message key={index} {...message} />
        ))}
        {isLoading && <TypingIndicator />}
      </div>
      <ChatInput onSubmit={handleSubmit} disabled={isLoading} />
    </div>
  )
}
```
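The `OpenAIStream` helper imported from `@/lib/openai` isn't shown above. Below is a minimal sketch of what such a helper can look like, assuming the official `openai` SDK and the stream adapter from the Vercel `ai` package; the option shape and names here are my assumptions rather than the project's actual code:

```typescript
// lib/openai.ts — illustrative sketch, not the project's actual helper
import OpenAI from 'openai'
import { OpenAIStream as toReadableStream } from 'ai'

const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY })

interface ChatStreamOptions {
  model: string
  messages: { role: string; content: string }[]
  temperature?: number
  stream: true
}

// Create a streaming chat completion and adapt it to a web ReadableStream
export async function OpenAIStream({ model, messages, temperature }: ChatStreamOptions) {
  const completion = await openai.chat.completions.create({
    model,
    messages: messages as OpenAI.Chat.Completions.ChatCompletionMessageParam[],
    temperature,
    stream: true
  })
  return toReadableStream(completion)
}
```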
2. Context Memory System
To make the conversation more coherent, we implemented a context memory system backed by a vector database:
```typescript
// lib/vector-store.ts
import { PineconeClient } from '@pinecone-database/pinecone'
import { OpenAIEmbeddings } from 'langchain/embeddings/openai'

export class VectorStore {
  private pinecone: PineconeClient
  private embeddings: OpenAIEmbeddings

  constructor() {
    this.pinecone = new PineconeClient()
    this.embeddings = new OpenAIEmbeddings()
  }

  async initialize() {
    await this.pinecone.init({
      environment: process.env.PINECONE_ENV!,
      apiKey: process.env.PINECONE_API_KEY!
    })
  }

  async storeConversation(messages: Message[]) {
    const index = this.pinecone.Index('conversations')

    // Convert the messages to vectors
    const vectors = await Promise.all(
      messages.map(async message => {
        const vector = await this.embeddings.embedQuery(message.content)
        return {
          id: message.id,
          values: vector,
          metadata: {
            role: message.role,
            content: message.content,
            timestamp: Date.now()
          }
        }
      })
    )

    // Store the vectors
    await index.upsert({ upsertRequest: { vectors } })
  }

  async retrieveContext(query: string, limit = 5) {
    const index = this.pinecone.Index('conversations')
    const queryVector = await this.embeddings.embedQuery(query)

    // Query for similar vectors
    const results = await index.query({
      queryRequest: {
        vector: queryVector,
        topK: limit,
        includeMetadata: true
      }
    })

    return results.matches.map(match => ({
      content: match.metadata?.content,
      score: match.score
    }))
  }
}
```
3. Prompt optimization
A good prompt is crucial to the quality of the AI's output:
```typescript
// lib/prompts.ts
export const createChatPrompt = (context: string, query: string) => ({
  messages: [
    {
      role: 'system',
      content: `You are a professional AI assistant. Using the context information below,
answer the user's question in concise, professional language.
If the question falls outside the given context, say so honestly.

Context information:
${context}`
    },
    { role: 'user', content: query }
  ],
  temperature: 0.7,        // Control creativity
  max_tokens: 1000,        // Control answer length
  presence_penalty: 0.6,   // Encourage topic expansion
  frequency_penalty: 0.5   // Avoid repetition
})
```
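Putting the retrieval and the prompt together, the chat route roughly looks like this. This is a sketch under my own assumptions (module paths such as `@/lib/vector-store` and `@/lib/prompts`, and reusing the `OpenAIStream` helper from earlier), not a verbatim excerpt of the project:

```typescript
// app/api/chat/route.ts — illustrative wiring of retrieval + prompt
import { VectorStore } from '@/lib/vector-store'
import { createChatPrompt } from '@/lib/prompts'
import { OpenAIStream } from '@/lib/openai'
import { StreamingTextResponse } from 'ai'

const vectorStore = new VectorStore()

export async function POST(req: Request) {
  const { messages } = await req.json()
  const query = messages[messages.length - 1].content

  // 1. Retrieve context relevant to the latest user message
  await vectorStore.initialize()
  const context = await vectorStore.retrieveContext(query)
  const contextText = context.map(item => item.content).join('\n')

  // 2. Build the prompt with the retrieved context
  const prompt = createChatPrompt(contextText, query)

  // 3. Stream the model's answer back to the client
  const stream = await OpenAIStream({
    model: 'gpt-4',
    messages: prompt.messages,
    temperature: prompt.temperature,
    stream: true
  })
  return new StreamingTextResponse(stream)
}
```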
Performance optimization
Performance optimization for AI applications mainly focuses on the following areas:
- Request optimization:
```typescript
// hooks/useChat.ts
export function useChat() {
  const [messages, setMessages] = useState<Message[]>([])

  // Debounce to avoid firing requests too frequently
  const debouncedChat = useMemo(
    () =>
      debounce(async (content: string) => {
        // ... send the request
      }, 500),
    []
  )

  // Cache responses to avoid duplicate requests
  const cache = useMemo(() => new Map<string, string>(), [])

  const sendMessage = async (content: string) => {
    // Check the cache first
    if (cache.has(content)) {
      setMessages(prev => [
        ...prev,
        { role: 'assistant', content: cache.get(content)! }
      ])
      return
    }

    // Otherwise send the request
    await debouncedChat(content)
  }

  return { messages, sendMessage }
}
```
- Streaming optimization:
```typescript
// lib/stream-processor.ts
export class StreamProcessor {
  private buffer: string = ''
  private decoder = new TextDecoder()

  process(chunk: Uint8Array, callback: (text: string) => void) {
    this.buffer += this.decoder.decode(chunk, { stream: true })

    // Split the buffer on sentence boundaries
    const sentences = this.buffer.split(/([.!?。！？]\s)/)
    if (sentences.length > 1) {
      // Emit the complete sentences
      const completeText = sentences.slice(0, -1).join('')
      callback(completeText)

      // Keep the unfinished part in the buffer
      this.buffer = sentences[sentences.length - 1]
    }
  }
}
```
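As an example of how this fits into the client code, the read loop from the Chat component can hand each chunk to the processor so the UI only re-renders on sentence boundaries. The `onSentence` callback below is hypothetical; it stands in for whatever function appends text to the assistant message:

```typescript
// Illustrative use of StreamProcessor in the client read loop
async function readStream(response: Response, onSentence: (text: string) => void) {
  const processor = new StreamProcessor()
  const reader = response.body!.getReader()

  while (true) {
    const { done, value } = await reader.read()
    if (done) break

    // Buffer the chunk and invoke the callback only for complete sentences
    processor.process(value, onSentence)
  }
}
```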
Deployment and monitoring
We deployed on Vercel and set up a complete monitoring system:
```typescript
// lib/monitoring.ts
// `metrics.record` and `countTokens` are placeholders for whichever
// metrics client and tokenizer helper (e.g. tiktoken) you use.
export class AIMonitoring {
  // Record request latency
  async trackLatency(startTime: number) {
    const duration = Date.now() - startTime
    await metrics.record('ai_request_latency', duration)
  }

  // Monitor token usage
  async trackTokenUsage(prompt: string, response: string) {
    const tokenCount = await countTokens(prompt + response)
    await metrics.record('token_usage', tokenCount)
  }

  // Monitor the error rate
  async trackError(error: Error) {
    await metrics.record('ai_errors', 1, {
      type: error.name,
      message: error.message
    })
  }
}
```
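As a rough usage sketch (my assumption of how it plugs in, not the exact production code), the monitoring class wraps the chat handler like this; `handleChat` is a hypothetical helper containing the route logic shown earlier:

```typescript
// Illustrative: wrapping the chat handler with AIMonitoring
const monitoring = new AIMonitoring()

export async function POST(req: Request) {
  const startTime = Date.now()
  try {
    // Delegate to the chat logic shown earlier (hypothetical helper)
    return await handleChat(req)
  } catch (error) {
    await monitoring.trackError(error as Error)
    throw error
  } finally {
    await monitoring.trackLatency(startTime)
  }
}
```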
Practical experience
I learned a lot while building this AI application:
- Streaming responses are the key to a good user experience
- Context management has to balance accuracy against performance
- Error handling and downgrade strategies matter (a sketch follows this list)
- Continuously refining prompts brings noticeable improvements
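On the third point, here is a minimal sketch of one possible downgrade strategy, reusing the `OpenAIStream` helper from earlier; it illustrates the idea rather than the exact strategy we shipped:

```typescript
// Sketch: fall back to a cheaper model if the primary model fails
async function chatWithFallback(messages: Message[]) {
  try {
    return await OpenAIStream({ model: 'gpt-4', messages, temperature: 0.7, stream: true })
  } catch (error) {
    console.error('Primary model failed, downgrading:', error)
    // Downgrade to a faster, cheaper model so the user still gets an answer
    return await OpenAIStream({ model: 'gpt-3.5-turbo', messages, temperature: 0.7, stream: true })
  }
}
```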
What surprised me most was the user feedback. One user said, "This is the most responsive AI application I've ever used!" That kind of feedback keeps us motivated.
Closing thoughts
AI application development is challenging, but it's also full of opportunity. The key is to stay focused on the user experience and to keep optimizing and iterating. As the saying goes, "AI is not magic, but engineering."