Jaccard vs cityblock
>>> import numpy as np
>>> import scipy.spatial as ss
>>> b = np.array([[1,0,1,1,1],[1,0,0,1,1]]).T
>>> b
array([[1, 1],
       [0, 0],
       [1, 0],
       [1, 1],
       [1, 1]])
>>> ss.distance.squareform(ss.distance.pdist(b))  # default metric: euclidean
array([[ 0.        ,  1.41421356,  1.        ,  0.        ,  0.        ],
       [ 1.41421356,  0.        ,  1.        ,  1.41421356,  1.41421356],
       [ 1.        ,  1.        ,  0.        ,  1.        ,  1.        ],
       [ 0.        ,  1.41421356,  1.        ,  0.        ,  0.        ],
       [ 0.        ,  1.41421356,  1.        ,  0.        ,  0.        ]])
>>> ss.distance.squareform(ss.distance.pdist(b, 'cityblock'))
array([[ 0.,  2.,  1.,  0.,  0.],
       [ 2.,  0.,  1.,  2.,  2.],
       [ 1.,  1.,  0.,  1.,  1.],
       [ 0.,  2.,  1.,  0.,  0.],
       [ 0.,  2.,  1.,  0.,  0.]])
>>> ss.distance.squareform(ss.distance.pdist(b, 'jaccard'))
array([[ 0. ,  1. ,  0.5,  0. ,  0. ],
       [ 1. ,  0. ,  1. ,  1. ,  1. ],
       [ 0.5,  1. ,  0. ,  0.5,  0.5],
       [ 0. ,  1. ,  0.5,  0. ,  0. ],
       [ 0. ,  1. ,  0.5,  0. ,  0. ]])
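Jaccard ignores positions where both values are 0, because intersection = AND and union = OR, so a position that is 0 in both vectors counts toward neither. The Jaccard index is intersection/union; the value pdist reports for 'jaccard' is the distance, 1 - intersection/union.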
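The AND/OR reading above can be checked by hand. The following is a minimal sketch, not SciPy's implementation: `jaccard_distance` is a hypothetical helper written for this note, and returning 0.0 when both vectors are all zeros is a convention assumed here. It uses the two length-5 vectors from which `b` was built, and compares the result with `scipy.spatial.distance.jaccard`.

```python
import numpy as np
from scipy.spatial import distance


def jaccard_distance(u, v):
    """Hypothetical helper: 1 - |u AND v| / |u OR v| for binary vectors."""
    u = np.asarray(u, dtype=bool)
    v = np.asarray(v, dtype=bool)
    intersection = np.logical_and(u, v).sum()  # positions that are 1 in both
    union = np.logical_or(u, v).sum()          # positions that are 1 in either
    if union == 0:
        return 0.0  # assumed convention for two all-zero vectors
    return 1.0 - intersection / union


u = [1, 0, 1, 1, 1]
v = [1, 0, 0, 1, 1]

manual = jaccard_distance(u, v)
library = distance.jaccard(np.array(u, dtype=bool), np.array(v, dtype=bool))

# Appending positions that are 0 in both vectors changes neither
# the intersection nor the union, so the distance stays the same.
padded = jaccard_distance(u + [0, 0, 0], v + [0, 0, 0])

print(manual, library, padded)
```

Here the intersection has 3 positions and the union has 4, so the distance is 1 - 3/4 = 0.25; the second position, which is 0 in both vectors, contributes to neither count.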