Jaccard-Text-Similarity-1

Fri 14 November 2025

title: "Jaccard Text Similarity 1" author: "Rj" date: 2019-04-21 description: "-" type: technical_note draft: false


def get_jaccard_similarity(content1, content2):
    # Tokenize on whitespace and keep only the unique words of each text.
    words1 = set(content1.split())
    words2 = set(content2.split())

    intersection = words1 & words2
    union = words1 | words2

    # Two empty texts have no words to compare; return 0.0 instead of dividing by zero.
    if not union:
        return 0.0

    return len(intersection) / len(union)
# test 2  
content1 = "These data could show that the people …
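As a quick check of the formula, here is a minimal sketch on two short strings. The helper name `jaccard` and both sample sentences are assumptions made for illustration; they are not the truncated test data from the post. The score is the number of shared unique words divided by the number of distinct words across both texts.

```python
def jaccard(text1, text2):
    """Jaccard similarity over whitespace-separated, unique words."""
    words1, words2 = set(text1.split()), set(text2.split())
    union = words1 | words2
    # Guard: two empty texts share nothing, so report 0.0 instead of dividing by zero.
    return len(words1 & words2) / len(union) if union else 0.0


# Hypothetical sample sentences, chosen so the arithmetic is easy to follow.
a = "the cat sat on the mat"
b = "the cat lay on the rug"

# Unique words: {the, cat, sat, on, mat} and {the, cat, lay, on, rug}
# Shared: {the, cat, on} -> 3; combined vocabulary: 7 distinct words.
score = jaccard(a, b)
print(round(score, 3))  # 0.429
```

Because the texts are reduced to sets, repeated words (such as "the" here) count once, and word order has no effect on the score.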

Category: textprocessing

Json Output Parser

Fri 14 November 2025
from constants import OPENAI_API_KEY
!pip show langchain-openai | grep "Version:"
Version: 0.2.9
import os
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY
from langchain_openai import ChatOpenAI

model = ChatOpenAI(model="gpt-4o-mini")
from langchain_core.output_parsers import JsonOutputParser

chain = (
    model | JsonOutputParser()
)  # Due to a bug in older versions of Langchain, JsonOutputParser did not stream results …

Category: langchain

Json-To-Xml

Fri 14 November 2025

title: "JSON to XML" author: "Rj" date: 2019-04-20 description: "List Test" type: technical_note draft: false


import xmltodict
content = {
  "note" : {
    "to" : "Tove",
    "from" : "Jani",
    "heading" : "Reminder",
    "body" : "Don't forget me this weekend!"
  }
}
content
{'note': {'body': "Don't forget me this weekend!",
  'from': 'Jani',
  'heading': 'Reminder',
  'to': 'Tove'}}
xml = xmltodict.unparse …
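For readers without `xmltodict` installed, the same idea can be sketched with the standard library. This swaps in `xml.etree.ElementTree` and is not what the post itself uses; the helper `dict_to_element` is a hypothetical name, and it assumes a dictionary with a single root key whose values are either nested dictionaries or plain scalars.

```python
import xml.etree.ElementTree as ET


def dict_to_element(tag, value):
    """Recursively turn a (tag, value) pair into an Element (hypothetical helper)."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        # Each key of a nested dict becomes a child element.
        for child_tag, child_value in value.items():
            element.append(dict_to_element(child_tag, child_value))
    else:
        # Scalars become the element's text.
        element.text = str(value)
    return element


content = {
    "note": {
        "to": "Tove",
        "from": "Jani",
        "heading": "Reminder",
        "body": "Don't forget me this weekend!",
    }
}

# Assumption: exactly one top-level key, which becomes the XML root.
root_tag, root_value = next(iter(content.items()))
root = dict_to_element(root_tag, root_value)
xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)
```

This sketch ignores attributes, lists, and namespaces, which a dedicated library handles for you.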

Category: basics

Kenlm-Sample

Fri 14 November 2025

title: "Kenlm Sample" author: "Rj" date: 2019-04-21 description: "-" type: technical_note draft: false


import kenlm
model = kenlm.Model('test.arpa')
print(model.score('this is a sentence .', bos = True, eos = True))
-49.579345703125



Score: 5

Category: textprocessing

Lambda-Custom

Fri 14 November 2025

title: "Lambda Custom" author: "Raja CSP Raman" date: 2019-04-20 description: "-" type: technical_note draft: false


cities = [
    'chennai', 'delhi', 'madurai', 'pune', 'bengaluru'
    ]
cities
['chennai', 'delhi', 'madurai', 'pune', 'bengaluru']
def is_south_indian_city(city):
    # A membership test replaces the chained equality checks.
    return city in ('chennai', 'madurai', 'bengaluru')
new_list = list(filter(lambda x: is_south_indian_city(x) , cities …
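The lambda in the call above only forwards its argument to `is_south_indian_city`, so the function can be handed to `filter` directly. The following is a minimal sketch of that variant next to the equivalent list comprehension; the constant `SOUTH_INDIAN_CITIES` and the names `with_filter` and `with_comprehension` are introduced here for illustration.

```python
# Hypothetical constant holding the cities the predicate accepts.
SOUTH_INDIAN_CITIES = {"chennai", "madurai", "bengaluru"}


def is_south_indian_city(city):
    return city in SOUTH_INDIAN_CITIES


cities = ["chennai", "delhi", "madurai", "pune", "bengaluru"]

# filter() accepts any callable, so the named function needs no lambda wrapper.
with_filter = list(filter(is_south_indian_city, cities))

# The same selection written as a list comprehension.
with_comprehension = [city for city in cities if is_south_indian_city(city)]

print(with_filter)  # ['chennai', 'madurai', 'bengaluru']
```

Both forms keep the original order of `cities`; which one to use is mostly a matter of taste.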

Category: basics

Lambda-Odd

Fri 14 November 2025

title: "Lambda Odd" author: "Raja CSP Raman" date: 2019-04-20 description: "-" type: technical_note draft: false


def is_odd(digit):
    # The comparison already yields a bool, so return it directly.
    return digit % 2 != 0
digits = [
    1, 7, 18, 2, 4, 2, 8, 5, 3
    ]
digits
[1, 7, 18, 2, 4, 2, 8, 5, 3]
new_list = list(filter(lambda x …
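The excerpt is cut off here, so what follows is only a minimal sketch of one way to finish the idea, assuming the goal is to keep the odd values of `digits`. The name `odd_digits` is an assumption.

```python
digits = [1, 7, 18, 2, 4, 2, 8, 5, 3]

# Keep a value when dividing it by 2 leaves a non-zero remainder.
odd_digits = list(filter(lambda d: d % 2 != 0, digits))

print(odd_digits)  # [1, 7, 5, 3]
```

In Python the test `d % 2 != 0` also holds for negative odd integers, because `%` takes the sign of the divisor (`-3 % 2` is `1`).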

Category: basics

Lambda-Square

Fri 14 November 2025

title: "Lambda Square" author: "Raja CSP Raman" date: 2019-04-20 description: "-" type: technical_note draft: false


squares = list(map(lambda x: x**2, range(10)))
squares
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
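For comparison, the same list can be built with a list comprehension, which many Python readers find easier to scan than `map` with a lambda. This is a small sketch; the names `squares_map` and `squares_comprehension` are chosen here for illustration.

```python
# map() applies the lambda to every value produced by range(10).
squares_map = list(map(lambda x: x**2, range(10)))

# The list comprehension expresses the same transformation inline.
squares_comprehension = [x**2 for x in range(10)]

print(squares_map == squares_comprehension)  # True
```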


Score: 0

Category: basics

Lancaster-Stemmer

Fri 14 November 2025

title: "Lancaster Stemmer" author: "Rj" date: 2019-04-21 description: "-" type: technical_note draft: false


from nltk.stem import LancasterStemmer
l_stemmer = LancasterStemmer()
print(l_stemmer.stem("hunting"))
hunt
words = [
    "hunting",
    "bunnies",
    "flies"
]
result = [l_stemmer.stem(word) for word in words]
result
['hunt', 'bunny', 'fli']

Score: 5

Category: textprocessing

Last-N-Rows

Fri 14 November 2025

title: "Last N Rows" author: "Rj" date: 2019-04-22 description: "-" type: technical_note draft: false


import numpy as np
import pandas as pd
df = pd.read_csv('data1.csv')
df
capacity score length
0 1 …

Category: data-wrangling

Last-Nth-To-First-Row

Fri 14 November 2025

title: "Last nth to last second" author: "Rj" date: 2019-04-22 description: "-" type: technical_note draft: false


import numpy as np
import pandas as pd
df = pd.read_csv('data1.csv')
df
capacity score length …

Category: data-wrangling
