num | 易学教程

Word2vec


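Earlier lessons introduced the one-hot vector representation. One-hot vectors are easy to construct, but they are usually not a good choice. A main reason is that one-hot word vectors cannot accurately express the similarity between different words under measures such as the commonly used cosine similarity. For example, beautiful and lovely have similar meanings, so they should be close to each other in the vector space, while a word like ugly should be far from both.

The Word2Vec word-embedding tool was proposed to solve exactly this problem. It represents each word as a fixed-length vector and, through pre-training on a corpus, makes these vectors express similarity and analogy relationships between words reasonably well, which introduces a degree of semantic information.

Based on two probabilistic modeling assumptions, we can define two Word2Vec models:

Skip-Gram: assumes the context words are generated by the center word, i.e. it models P(w_o | w_c), where w_c is the center word and w_o is any context word.

CBOW (continuous bag-of-words): assumes the center word is generated by the context words, i.e. it models P(w_c | W_o), where W_o is the set of context words.

Here we mainly cover the implementation of the Skip-Gram model. The CBOW implementation is similar, and readers can try it themselves afterwards. The rest of the content is organized into roughly four parts:

1. PTB dataset
2. Skip-Gram model
3. Negative sampling approximation
4. Training the model

PTB dataset

To train a Word2Vec model we need a natural-language corpus, from which the model learns the relationships between words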
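As a small illustration of two points from this excerpt, the sketch below first shows that the cosine similarity between any two distinct one-hot vectors is 0, so beautiful is no closer to lovely than to ugly, and then enumerates the (center word, context word) pairs that the Skip-Gram assumption P(w_o | w_c) is modeled over. The three-word vocabulary, the helper names and the `window` parameter are assumptions made for this example and do not come from the original article.

```python
import math


def one_hot(index, size):
    """Return a one-hot vector of length `size` with a 1.0 at position `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def cosine_similarity(x, y):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)


def skip_gram_pairs(tokens, window):
    """List (center, context) pairs: every word within `window` positions of the center."""
    pairs = []
    for i, center in enumerate(tokens):
        start = max(0, i - window)
        stop = min(len(tokens), i + window + 1)
        for j in range(start, stop):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs


# Hypothetical toy vocabulary, used only for illustration.
vocab = ["beautiful", "lovely", "ugly"]
vectors = {word: one_hot(i, len(vocab)) for i, word in enumerate(vocab)}

# Distinct one-hot vectors are orthogonal, so both similarities are zero.
sim_close = cosine_similarity(vectors["beautiful"], vectors["lovely"])
sim_far = cosine_similarity(vectors["beautiful"], vectors["ugly"])
print(sim_close, sim_far)

print(skip_gram_pairs(["a", "b", "c"], window=1))
```

Because every pair of different words gets the same score, one-hot vectors carry no notion of "closer" or "farther", which is the gap that learned dense embeddings are meant to fill.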

ES6 Syntax


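ES6 Syntax

The let and const keywords

We used to declare variables with the var keyword. ES6 added two more keywords, let and const, which also declare variables but differ from var in a few ways.

1. let and const do not allow a variable to be declared twice.

    // Redeclaring a variable with var is allowed; the later declaration simply overrides the earlier one
    var num = 100
    var num = 200

2. Variables declared with let and const are not hoisted the way var declarations are: they cannot be accessed before the declaration. (The region before a let declaration is called the temporal dead zone.)

3. Variables declared with let and const are scoped to the enclosing code block (a narrower scope): every pair of braces forms a scope.

    let num;
    alert(num); // undefined
    const num2 = 20;
    alert(num2);

    // Redeclaring a variable with const throws an error
    const num = 100
    const num = 200 // this line throws an error

ii. Variables declared with let and const are not resolved during pre-parsing (that is, there is no variable hoisting).

    alert(num); // ReferenceError: num is still in the temporal dead zone
    const num = 10;
    alert(num);

iii. Variables declared with let and const are limited in scope by every code block.

    // Variables declared with var are scoped only by functions; nothing else limits their scope
    if (true) { var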

30 SQL Statement Optimizations


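01 When optimizing queries, avoid full table scans as far as possible; first consider creating indexes on the columns used in where and order by.

02 Avoid the != or <> operators in the where clause where possible; otherwise the engine may abandon the index and perform a full table scan.

03 Avoid testing a field for null in the where clause where possible; otherwise the engine may abandon the index and perform a full table scan. For example:

    select id from t where num is null

You can set a default value of 0 on num, make sure the num column contains no null values, and then query like this:

    select id from t where num=0

04 Avoid joining conditions with or in the where clause where possible; otherwise the engine may abandon the index and perform a full table scan. For example:

    select id from t where num=10 or num=20

You can query like this instead:

    select id from t where num=10
    union all
    select id from t where num=20

05 The following query also causes a full table scan:

    select id from t where name like '%abc%'

To improve efficiency, consider full-text search.

06 Use in and not in with caution as well; otherwise they may cause a full table scan. For example:

    select id from t where num in(1,2,3)

For consecutive values, where you can use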
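Whether a given query actually uses an index depends on the database engine and its version, so rules like these are best checked against the query plan. The sketch below is one way to do that, using SQLite from Python's standard library purely as a stand-in (the excerpt does not say which engine it targets, and other engines may plan these queries differently). The table `t` matches the examples above; the index names and the `query_plan` helper are assumptions for this example.

```python
import sqlite3

# In-memory database with the table used in the examples above.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, num INTEGER, name TEXT)")
conn.execute("CREATE INDEX idx_t_num ON t (num)")    # hypothetical index name
conn.execute("CREATE INDEX idx_t_name ON t (name)")  # hypothetical index name


def query_plan(sql):
    """Return SQLite's plan description for `sql` as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


# Equality on an indexed column: SQLite reports an index search.
plan_eq = query_plan("select id from t where num = 10")

# Leading-wildcard LIKE: the index cannot narrow the range, so SQLite scans.
plan_like = query_plan("select id from t where name like '%abc%'")

print(plan_eq)
print(plan_like)
```

In SQLite's output, a plan line starting with SEARCH means an index (or the primary key) is used to narrow the rows, while SCAN means every row or every index entry is visited. Running the same check on your own engine is the reliable way to confirm which of the rules above apply to it.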

Thread Pool Code


/*** threadpool.h ***/
#ifndef __THREADPOOL_H_
#define __THREADPOOL_H_

typedef struct threadpool_t threadpool_t;

/**
 * @function threadpool_create
 * @desc Creates a threadpool_t object.
 * @param min_thr_num minimum number of threads
 * @param max_thr_num maximum number of threads
 * @param queue_max_size size of the queue.
 * @return a newly created thread pool or NULL
 */
threadpool_t *threadpool_create(int min_thr_num, int max_thr_num, int queue_max_size);

/**
 * @function threadpool_add
 * @desc add a new task in the queue of a thread pool
 * @param pool Thread pool to which add the task.
 * @param function Pointer to the

[C] Simple Snippet: Reading Multiple Lines of Input


The first line of input is a number n, meaning that n lines of sentences follow; each subsequent line is one sentence.

#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define SLEN 128

int getStrings(char * str)
{
    int num = 0;
    char c;
    do{
        if (scanf("%c", &c) != 1) c = '\n'; // treat end of input as end of line
        str[ num ++ ] = c;
    }while(c != '\n' && num < SLEN);
    str[num - 1] = '\0';
    return num - 1;
}

int main(int argc, char ** argv)
{
    int i = 0, j = 0;
    int num = 0;
    char **pStr;
    char tmp[SLEN] ;
    char cEnt;

    scanf("%d", &num);  // the first line gives num, the number of sentences
    scanf("%c", &cEnt); // consume the newline
    // allocate a two-dimensional array holding num lines of sentences
    pStr = (char **)malloc(num*sizeof(char *));
    for(i = 0; i < num ; i++)
    {
        pStr[i] = (char *)malloc(SLEN*sizeof(char));
        memset(pStr[i], 0, SLEN);

P3292 [SCOI2016] 幸运数字 (Lucky Numbers)


Problem link

Problem analysis

In one sentence: pick some numbers from a chain (path) in a tree so that their XOR sum is as large as possible.

We can maintain a linear basis with binary lifting on the tree, merging linear bases as we lift. It is fairly easy to write.

CODE:

#include<iostream>
#include<cstdio>
#include<cstring>
#include<cmath>
#include<algorithm>
#include<cstdlib>
#include<string>
#include<queue>
#include<map>
#include<stack>
#include<list>
#include<set>
#include<deque>
#include<vector>
#include<ctime>
#define ll long long
#define inf 0x7fffffff
#define N 200008
#define IL inline
#define M 66
#define D double
#define ull unsigned long long
#define R register
using namespace std;
template<typename T>IL void read(T &_)
{
    T __=0,___=1;char ____=getchar();
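The core building block mentioned in the analysis, a linear basis that supports insertion, merging and a maximum-XOR query, can be sketched on its own. The version below is a minimal Python illustration, not the author's C++ solution: the class name, the method names and the 64-bit width are assumptions for this example, and the binary-lifting tables over the tree are left out.

```python
BITS = 64  # assumption: every value fits in 64 bits


class LinearBasis:
    """XOR linear basis: basis[i] is 0 or a value whose highest set bit is i."""

    def __init__(self):
        self.basis = [0] * BITS

    def insert(self, value):
        """Try to add `value`; return False if it is already representable."""
        for bit in range(BITS - 1, -1, -1):
            if not (value >> bit) & 1:
                continue
            if self.basis[bit] == 0:
                self.basis[bit] = value
                return True
            # Eliminate the current highest bit and keep reducing.
            value ^= self.basis[bit]
        return False

    def merge(self, other):
        """Insert every basis vector of `other` into this basis."""
        for value in other.basis:
            if value:
                self.insert(value)

    def max_xor(self):
        """Largest XOR obtainable from any subset of the inserted values."""
        best = 0
        for bit in range(BITS - 1, -1, -1):
            if (best ^ self.basis[bit]) > best:
                best ^= self.basis[bit]
        return best


left = LinearBasis()
for v in (5, 3):
    left.insert(v)

right = LinearBasis()
right.insert(8)

left.merge(right)
print(left.max_xor())
```

In a binary-lifting solution, one such basis would be stored per (node, jump length) entry, and answering a path query amounts to merging the bases of the jumps that cover the path before calling `max_xor`.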

A Simple Thread Pool Implementation


/*pthreadpool.h*/
#ifndef PTHREADPOOL
#define PTHREADPOOL
#include<pthread.h>

struct task_node{

    void *(*run)(void *arg); // function pointer for the task
    void *arg;               // argument passed to the function
    struct task_node *next;
};

typedef struct task_node task_node;

struct pthreadpool{
    task_node *head; // head pointer of the task queue
    task_node *top;  // tail pointer of the task queue

    int max_pthread; // maximum number of threads the pool may create

    int pthread_num;      // number of threads
    int pthread_wait_num; // number of waiting threads
    int pthread_work_num; // number of working threads

    pthread_mutex_t mutex; // mutex protecting the task queue
    pthread_cond_t cond;   // a condition variable
};
typedef struct pthreadpool pthreadpool;

// thread pool initialization

PAT Advanced 1034 Head of a Gang (30) [graph traversal, BFS, DFS, union-find]


Problem

One way that the police finds the head of a gang is to check people’s phone calls. If there is a phone call between A and B, we say that A and B is related. The weight of a relation is defined to be the total time length of all the phone calls made between the two persons. A “Gang” is a cluster of more than 2 persons who are related to each other with total relation weight being greater than a given threshold K. In each gang, the one with maximum total weight is the head. Now given a list of phone calls, you are supposed to find the gangs and the heads.

Input Specification:

Each input file

Several Templates for Finding Primes with a Sieve


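Definition method

Primes can be found directly from the definition: check whether any number from 2 to sqrt(x) divides x. If such a number exists, x is not prime; otherwise it is prime. Each test costs O(sqrt(x)). This approach is usable when the data size is small.

bool isprime(int x) // test whether x is prime
{
    if ( x<=1 ) return false;
    for (int i = 2; i*i<=x; i++)
        if (x%i==0) return false;
    return true;
}

Ordinary sieve

When the scale grows and we need all primes in some range, testing each number separately easily becomes too slow. So consider this proposition: if a number greater than 1 is not prime, there must be a prime smaller than it that divides it. The proposition is easy to prove. Suppose we already have all primes smaller than x; then to decide whether x is prime we only need to check whether any of these primes divides x. This still requires a lot of trial division, so we turn the idea around: whenever we obtain a prime, we mark all of its multiples as non-prime. Then, during the traversal, if a number has not been marked, no prime smaller than it divides it, so it can be accepted as prime.

#define MAXSIZE 10001
int Mark[MAXSIZE];
int prime[MAXSIZE];
// Find the primes. Mark: marker array; index: number of primes found
int Prime(){
    int index = 0;
    memset(Mark,0,sizeof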
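The marking idea described above is the sieve of Eratosthenes, which runs in O(n log log n) time. Since the original C code is cut off, here is a minimal self-contained Python sketch of that idea; the function and variable names are chosen for this example and are not taken from the original template.

```python
def sieve_primes(limit):
    """Return all primes <= limit using the sieve of Eratosthenes."""
    if limit < 2:
        return []
    is_composite = [False] * (limit + 1)  # plays the role of the marker array
    primes = []
    for n in range(2, limit + 1):
        if is_composite[n]:
            continue
        # n was never marked, so no smaller prime divides it: n is prime.
        primes.append(n)
        # Mark the multiples of n. Starting at n * n is enough, because
        # smaller multiples were already marked by smaller primes.
        for multiple in range(n * n, limit + 1, n):
            is_composite[multiple] = True
    return primes


print(sieve_primes(30))
```

A number may still be marked more than once here (12 is marked by both 2 and 3), which is why this version is not strictly linear; the Euler sieve avoids the repeated marking by crossing out each composite only through its smallest prime factor.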

leetcode1356


import collections

class Solution:
    def countBitOnes(self,num):
        count = 0
        # 10^4 < 2^14, so checking 14 bits covers every allowed value
        for i in range(14):
            if num & 1 == 1:
                count += 1
            num >>= 1
        return count

    def sortByBits(self, arr: 'List[int]') -> 'List[int]':
        # group the numbers by how many 1 bits they contain
        dic = collections.OrderedDict()
        for num in arr:
            count = self.countBitOnes(num)
            #print(count)
            if count not in dic:
                dic[count] = [num]
            else:
                dic[count].append(num)
        result = []
        # visit the groups in ascending bit count, sorting each group by value
        sort_dic = sorted(dic.items(),key=lambda d:d[0])
        # print(sort_dic)
        for k,v in sort_dic:
            result += sorted(v)
        return result

Approach: bit manipulation. The problem guarantees 0 <= arr[i] <= 10^4
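The same ordering (ascending number of 1 bits, ties broken by value) can also be expressed as a single sort with a composite key. The following is an alternative sketch rather than the solution above; the function name `sort_by_bits` and the sample input are chosen for this example.

```python
def sort_by_bits(arr):
    """Sort by the number of 1 bits, breaking ties by the value itself."""
    # bin(x) gives a string such as '0b101'; counting '1' characters
    # yields the number of set bits for non-negative integers.
    return sorted(arr, key=lambda x: (bin(x).count("1"), x))


sample = [0, 1, 2, 3, 4, 5, 6, 7, 8]
result = sort_by_bits(sample)
print(result)
```

Sorting on a tuple key removes the need for the intermediate dictionary, and counting bits through `bin` does not rely on the 14-bit bound, so it works for any non-negative integer.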
