search

Django Haystack indexing is not working for many to many field in model

我的未来我决定 submitted on 2019-12-22 00:48:28
Question: I am using Haystack in our Django application for search, and search works fine. But I am having an issue with realtime search. For realtime search I am using Haystack's default RealtimeSignalProcessor (haystack.signals.RealtimeSignalProcessor). My model contains one many-to-many field. When data is changed only for this many-to-many field, the RealtimeSignalProcessor does not seem to update the index data properly. After updating the many-to-many data, I am getting wrong
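A common explanation for this symptom (not stated in the truncated post) is that Django's post_save signal fires before many-to-many rows are written, so the realtime processor re-indexes stale data; listening to m2m_changed as well usually fixes it. Below is a minimal sketch under that assumption; "myapp", the Note model, and its "tags" many-to-many field are placeholder names, not from the original question:

```python
# Hypothetical sketch: re-index a Note whenever its many-to-many "tags" change.
from django.db.models.signals import m2m_changed
from django.dispatch import receiver
from haystack import connections

from myapp.models import Note  # placeholder app/model


@receiver(m2m_changed, sender=Note.tags.through)
def reindex_note_on_tag_change(sender, instance, action, **kwargs):
    # post_add / post_remove / post_clear fire after the M2M rows are saved,
    # so at this point the index can see the new related values.
    if action in ("post_add", "post_remove", "post_clear"):
        index = connections["default"].get_unified_index().get_index(Note)
        index.update_object(instance)
```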

Solr accent removal

非 Y 不嫁゛ submitted on 2019-12-22 00:29:18
Question: I have read various threads about how to remove accents at index/query time. The current fieldType I have come up with looks like the following: <fieldType name="text_general" class="solr.TextField"> <analyzer> <tokenizer class="solr.StandardTokenizerFactory"/> <filter class="solr.ASCIIFoldingFilterFactory"/> <filter class="solr.LowerCaseFilterFactory" /> </analyzer> </fieldType> After adding some test documents to the index, I checked via http://localhost:8080/solr/test
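Outside Solr, the accent-folding idea behind ASCIIFoldingFilterFactory can be illustrated in a few lines of Python: decompose each character and drop the combining marks. This is only an illustration of the concept, not part of the original question or of Solr's implementation:

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Rough accent folding: decompose characters (NFD) and drop the
    combining marks, so 'café' and 'cafe' compare equal."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(fold_accents("café"))            # cafe
print(fold_accents("café") == "cafe")  # True
```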

Detecting duplicate lines on file using c

ⅰ亾dé卋堺 submitted on 2019-12-22 00:29:08
Question: I have a CSV file with about 15,000-25,000 lines (of fixed size) and I want to know how I can detect duplicated lines using C. An example of the output looks like this: 0123456789;CUST098WZAX;35 I have no memory or time constraint, so I want the simplest solution. Thanks for your help. Answer 1: #include <stdio.h> #include <stdlib.h> #include <string.h> struct somehash { struct somehash *next; unsigned hash; char *mem; }; #define THE_SIZE 100000 struct somehash *table[THE_SIZE] = { NULL,};
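The answer is cut off above, but the approach it starts (store each line in a hash structure and flag repeats) can be shown compactly. For comparison only, here is the same logic as a short Python sketch rather than the original C answer; "data.csv" is a placeholder path:

```python
# Illustration only: report every line that has already been seen earlier in the file.
seen = set()

with open("data.csv", encoding="utf-8") as f:  # "data.csv" is a placeholder
    for lineno, line in enumerate(f, start=1):
        line = line.rstrip("\n")
        if line in seen:
            print(f"duplicate at line {lineno}: {line}")
        else:
            seen.add(line)
```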

PHP search script for mySQL database, only 3 letter working

两盒软妹~` submitted on 2019-12-22 00:26:02
Question: I am trying to do a PHP search against a MySQL database. The following code behaves oddly: it matches very well when I enter only 3 letters. E.g. I have a product named 'deepbluehealth omega'; if I type 'ome' it is picked up, if I type 'ega' it is picked up, but if I type 'omega' no result is shown. If I type 'deepbluehealth' it is picked up with no problem. <?php error_reporting(E_ALL); ini_set('display_errors', '1'); $search_output = ""; if(isset($_POST['searchquery']) && $_POST['searchquery'] != ""){ $searchquery = $

Postgres full text search: how to search multiple words in multiple fields?

元气小坏坏 submitted on 2019-12-21 22:02:47
Question: I'm using PostgreSQL for the first time and I'm trying to create a search engine for my website. I have this table: CREATE TABLE shop ( id SERIAL PRIMARY KEY, name TEXT NOT NULL, description TEXT, address TEXT NOT NULL, city TEXT NOT NULL ); Then I created an index for every field of the table (is this the right way? Or can I create one index for all fields?): CREATE INDEX shop_name_fts ON shop USING gin(to_tsvector('italian', name)); CREATE INDEX shop_desc_fts ON shop USING gin(to
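The post is truncated, but the standard way to search several words across several columns is to concatenate the columns into one tsvector and match it against a tsquery. Below is a minimal Python sketch of that approach using psycopg2 against the shop table defined above; the connection parameters and search term are placeholders, not from the original question:

```python
import psycopg2

# Placeholder connection settings -- adjust to your environment.
conn = psycopg2.connect(dbname="mydb", user="me", password="secret", host="localhost")

query = """
    SELECT id, name, city
    FROM shop
    WHERE to_tsvector('italian',
              coalesce(name, '') || ' ' || coalesce(description, '') || ' ' ||
              coalesce(address, '') || ' ' || coalesce(city, ''))
          @@ plainto_tsquery('italian', %s)
"""

with conn, conn.cursor() as cur:
    # plainto_tsquery ANDs the words together, so every word must appear somewhere.
    cur.execute(query, ("pizzeria napoli",))
    for row in cur.fetchall():
        print(row)
```

A single expression index built over the same concatenated to_tsvector would let PostgreSQL serve this combined search from one GIN index instead of one index per column.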

PySpark: Search For substrings in text and subset dataframe

落花浮王杯 submitted on 2019-12-21 22:00:30
Question: I am brand new to PySpark and want to translate my existing pandas / Python code to PySpark. I want to subset my dataframe so that only rows containing the specific keywords I'm looking for in the 'original_problem' field are returned. Below is the Python code I tried in PySpark: def pilot_discrep(input_file): df = input_file searchfor = ['cat', 'dog', 'frog', 'fleece'] df = df[df['original_problem'].str.contains('|'.join(searchfor))] return df When I try to run the above, I get the following
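The error message is cut off, but the pandas `.str.contains` idiom has no direct equivalent on a Spark column; the usual replacement is a filter with a regex match. A minimal sketch of that substitution (column name and keywords taken from the question, everything else assumed):

```python
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.getOrCreate()


def pilot_discrep(df):
    # Keep only rows whose 'original_problem' contains any of the keywords,
    # using rlike (regex match) instead of pandas' str.contains.
    searchfor = ["cat", "dog", "frog", "fleece"]
    return df.filter(col("original_problem").rlike("|".join(searchfor)))


# Example usage with a tiny in-memory DataFrame:
df = spark.createDataFrame(
    [("torn fleece cover",), ("hydraulic leak",)], ["original_problem"]
)
pilot_discrep(df).show()
```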

Python: Finding partial string matches in a large corpus of strings

落花浮王杯 submitted on 2019-12-21 21:33:10
Question: I'm interested in implementing autocomplete in Python. For example, as the user types in a string, I'd like to show the subset of files on disk whose names start with that string. What's an efficient algorithm for finding strings that match some condition in a large corpus (say a few hundred thousand strings)? Something like: matches = [s for s in allfiles if s.startswith(input)] I'd like the condition to be flexible; e.g. instead of a strict startswith, it'd be a match so long as all
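For the strict prefix case, a common trick is to sort the corpus once and locate the prefix range with binary search, which is much cheaper per keystroke than scanning every string. A small sketch of that idea (not from the original question; the file names are made up):

```python
import bisect

# Build once: a sorted copy of the corpus.
allfiles = sorted(["alpha.txt", "alphabet.csv", "beta.log", "alpaca.md"])


def prefix_matches(sorted_names, prefix):
    # bisect_left finds the first name >= prefix; every name starting with the
    # prefix then sits in one contiguous block ending before prefix + '\uffff'.
    lo = bisect.bisect_left(sorted_names, prefix)
    hi = bisect.bisect_right(sorted_names, prefix + "\uffff")
    return sorted_names[lo:hi]


print(prefix_matches(allfiles, "alp"))  # ['alpaca.md', 'alpha.txt', 'alphabet.csv']
```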

PL/SQL to search a string in whole database

只愿长相守 submitted on 2019-12-21 21:26:54
Question: More than a question, this is an information-sharing post. I came across a situation today where I needed to look for a string in the entire database of an application with no idea which table/column it belongs to. Below is a PL/SQL block I wrote and used for this purpose. Hope it helps others with a similar requirement. Declare i NUMBER := 0; counter_intable NUMBER :=0; BEGIN FOR rec IN ( select 'select count(*) ' || ' from '||table_name|| ' where '||column_name||' like''%732-851%'
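The shared PL/SQL block is truncated above, but the same idea (build one COUNT query per text column from the data dictionary and run them dynamically) can also be driven from Python. A rough sketch using cx_Oracle, with placeholder credentials and the literal '732-851' from the post as the search term; this is an illustration, not the original block:

```python
import cx_Oracle

# Placeholder credentials/DSN -- adjust to your environment.
conn = cx_Oracle.connect("scott", "tiger", "localhost/XEPDB1")
needle = "732-851"

with conn.cursor() as cur:
    # List every character column owned by the current schema.
    cur.execute(
        """SELECT table_name, column_name
           FROM user_tab_columns
           WHERE data_type IN ('CHAR', 'VARCHAR2', 'NCHAR', 'NVARCHAR2')"""
    )
    columns = cur.fetchall()

    for table, column in columns:
        # Table/column names come from the data dictionary, so interpolating them
        # is acceptable here; the search value itself is passed as a bind variable.
        cur.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE "{column}" LIKE :needle',
            needle=f"%{needle}%",
        )
        count = cur.fetchone()[0]
        if count:
            print(f"{table}.{column}: {count} row(s)")
```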

Case Insensitive hash (SHA) of a string

三世轮回 submitted on 2019-12-21 20:58:55
Question: I'm passing a name string and its SHA1 value into a database. The SHA value is used as an index for searches. After the implementation was done, we got the requirement to make searching by name case insensitive. We need to take all languages into account (Chinese characters are a real use case). I know about the Turkey Test. How can I transform my input string before hashing so the result is case insensitive? Ideally I'd like it to be the equivalent of InvariantCultureIgnoreCase. In other words, how do
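The question is cut off, but the usual recipe for a case-insensitive hash is to Unicode-normalize the string and apply case folding before hashing, so that differently cased or differently composed forms of the same name produce the same digest. A small Python sketch of that idea; note that Python's str.casefold approximates, but is not identical to, .NET's InvariantCultureIgnoreCase:

```python
import hashlib
import unicodedata


def case_insensitive_sha1(name: str) -> str:
    # Normalize first (NFC) so composed/decomposed forms match, then case-fold,
    # which is stronger than lower() (e.g. 'ß' becomes 'ss').
    canonical = unicodedata.normalize("NFC", name).casefold()
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


print(case_insensitive_sha1("Straße") == case_insensitive_sha1("STRASSE"))  # True
print(case_insensitive_sha1("Élise") == case_insensitive_sha1("élise"))      # True
```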