
interview's Introduction


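Interview: a knowledge base for IT-industry interview preparation

"A programmer's hands are a magician's hands: they turn dry, tedious code into rich and colorful software." (from 《疯狂的程序员》)

Read online

License

CC BY-NC-SA 4.0

Sponsor us

WeChat & Alipay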


interview's People

Contributors

jiangzhonglian, wizardforcel

interview's Issues

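Project layout planning

Interviews at large companies: round 1 covers breadth of knowledge, round 2 technical depth, round 3 project experience, round 4 career planning
Interviews at small companies: round 1 covers skills and projects, round 2 career planning

Breadth of knowledge: basic algorithm interview questions, plus breadth across your skills
Technical depth: building on that breadth, the interviewer picks a few specific topics and digs into the underlying principles and optimizations
Project experience: mainly a check of whether your project experience shows you can meet the requirements of the role the company is hiring for
Career planning: mostly small talk; after all, both sides are just sounding each other out

To be fleshed out later

  1. How to write a good résumé
  2. How to submit résumés
  3. How to practice problems
  4. How to build up projects
  5. Interview takeaways

Additions welcome ...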

A question about data set features

When building the training/test data sets, why wasn't the PassengerId column dropped? Can it also be used as a feature?
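For reference, one common way to treat an identifier column such as PassengerId is to set it aside for the submission file and leave it out of the feature matrix, since a row identifier normally carries no predictive signal. The sketch below assumes a tiny made-up Titanic-style frame; the values and the variable names `passenger_ids`, `labels` and `features` are illustrative and are not taken from the project's script.

```python
import pandas as pd

# Toy stand-in for the Titanic training data (values are made up).
train = pd.DataFrame({
    "PassengerId": [1, 2, 3, 4],
    "Pclass": [3, 1, 3, 1],
    "Age": [22.0, 38.0, 26.0, 35.0],
    "Survived": [0, 1, 1, 1],
})

# Keep the identifier for the output file, but do not train on it.
passenger_ids = train["PassengerId"]
labels = train["Survived"]
features = train.drop(columns=["PassengerId", "Survived"])

print(list(features.columns))
```

Whether the project's own pipeline keeps or drops the column is a separate question that only its code can answer.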

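BAT (Baidu, Alibaba, Tencent) - a collection of technical interview questions

Adding system design interview answers to the Interview project

  1. I translated a booklet a while ago that could be merged in:
  2. 一亩三分地 (1point3acres) has a system design board where many people post English-language resources that could be translated
  3. HighScalability.com is an authoritative site, but I don't know where to start with it.

Recommended links: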

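[Collection] Feature engineering

Hello, could you explain the feature selection process when you get a chance? In particular the feature extraction algorithms used during feature selection, for example how heuristic algorithms proceed.

Already merged and updated in: #343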

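Competitions & call for organizers

If you have ideas and the enthusiasm to take part in a competition (or to reproduce an existing one) but have nobody to team up with, join us and become a competition organizer! Launch your own activity, recruit teammates, learn from each other, and aim for better results!

Please leave a comment on this ISSUE in the form "nickname + QQ + competition name", for example: "飞龙+562826179+kaggle Leaf Classification".

Organizer | QQ | Competition | Notes
张一极 | 2533524298 | Exploring a model with 100% accuracy on handwritten digits |
呆呆 | 728634974 | Anything related to DS or KDD; personal level is Kaggle top 20; for CV, small data sets are fine, since my compute is limited |
Roman | 570515024 | Big data |
1266 | 1097828409 | Santander Customer Transaction Prediction |
Datawhale | - | Building a text sentiment classification model |

Competition platforms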

Columns and DataType Not Explicitly Set on line 345 of titanic-python3.6.py

Hello!

I found an AI-Specific Code smell in your project.
The smell is called: Columns and DataType Not Explicitly Set

You can find more information about it in this paper: https://dl.acm.org/doi/abs/10.1145/3522664.3528620.

According to the paper, the smell is described as follows:

Problem: If the columns are not selected explicitly, it is not easy for developers to know what to expect in the downstream data schema. If the datatype is not set explicitly, it may silently continue the next step even though the input is unexpected, which may cause errors later. The same applies to other data importing scenarios.
Solution: It is recommended to set the columns and DataType explicitly in data processing.
Impact: Readability

Example:

### Pandas Column Selection
import pandas as pd
df = pd.read_csv('data.csv')
+ df = df[['col1', 'col2', 'col3']]

### Pandas Set DataType
import pandas as pd
- df = pd.read_csv('data.csv')
+ df = pd.read_csv('data.csv', dtype={'col1': 'str', 'col2': 'int', 'col3': 'float'})
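The two snippets above can also be combined into a single `read_csv` call. The following is a minimal runnable sketch, assuming an in-memory CSV in place of `data.csv`; the column names and the `extra` column are hypothetical and only serve to show that unlisted columns are left out.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for 'data.csv'.
csv_text = "col1,col2,col3,extra\na,1,0.5,x\nb,2,1.5,y\n"

df = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["col1", "col2", "col3"],  # select the columns explicitly
    dtype={"col1": "str", "col2": "int64", "col3": "float64"},  # set dtypes explicitly
)

print(df.dtypes)
```

With both arguments in place, an unexpected value such as text in `col2` raises an error at load time rather than propagating silently.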

You can find the code related to this smell in this link:

pca_tr_data = do_FeatureEngineering(train_data)
pca_te_data = do_FeatureEngineering(test_data)
# 3. Model training / model ensembling (classification: lr, rf, adaboost, xgboost, lightgbm)
model = trainModel(pca_tr_data, train_label)
model.fit(pca_tr_data, train_label)
labels = model.predict(pca_te_data)
# 4. Export the data
print(type(pids), type(labels.tolist()))
result = pd.DataFrame({
    'PassengerId': pids,
    'Survived': [int(i) for i in labels.tolist()]
})
result.to_csv('Result_titanic.csv', index=False)
# End time
end_time = datetime.datetime.now()
times = (end_time - sta_time).seconds
print("\nRun time: %ss == %sm == %sh\n\n" % (times, times/60, times/60/60))

I also found instances of this smell in other files, such as:

File: https://github.com/apachecn/Interview/blob/master/src/py3.x/kaggle/getting-started/digit-recognizer/cnn_pytorch-python3.6.py#L24-L34 Line: 29
File: https://github.com/apachecn/Interview/blob/master/src/py3.x/kaggle/getting-started/digit-recognizer/cnn_pytorch-python3.6.py#L34-L44 Line: 39
File: https://github.com/apachecn/Interview/blob/master/src/py3.x/kaggle/getting-started/digit-recognizer/knn-python3.6.py#L19-L29 Line: 24
File: https://github.com/apachecn/Interview/blob/master/src/py3.x/kaggle/getting-started/digit-recognizer/knn-python3.6.py#L20-L30 Line: 25
File: https://github.com/apachecn/Interview/blob/master/src/py3.x/kaggle/getting-started/digit-recognizer/rf-python3.6.py#L23-L33 Line: 28

I hope this information is helpful!

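[Career advancement] Recommended management books

Economics and business titles:

  • 金字塔原理 (The Pyramid Principle; a classic McKinsey training text for 40 years)
  • 市场研究实务 (a book long used in onboarding assessments at research firms)
  • 品牌知行 (part of the Brand Trilogy)
  • 决战大数据 (by the "father of big data" at Alibaba)

Management thinking:

  • The Five Dysfunctions of a Team, by Patrick Lencioni (团队协作的五大障碍)
  • The 4 Disciplines of Execution, by Chris McChesney, Sean Covey, and Jim Huling (高效能人士的执行4原则)
  • Nonviolent Communication, by Marshall Rosenberg (非暴力沟通)
  • The Checklist Manifesto, by Atul Gawande (清单革命)
  • The Ideal Team Player, by Patrick Lencioni (理想的团队成员)
  • Learned Optimism, by Martin Seligman (活出最乐观的自己)
  • Built to Last, by Jim Collins (基业长青)
  • The Fifth Discipline (第五项修炼)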

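Discussion of image data processing methods

One option is to scale the original images down, which reduces the number of features.
Sharpening afterwards makes the images crisper and the features more prominent.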
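The two steps can be sketched with NumPy alone. This is an illustration under stated assumptions rather than the method used in the project: the 2x2 block averaging, the particular 3x3 sharpening kernel, and the helper names `downscale_2x` and `sharpen` are all choices made for the example, and the image is random data.

```python
import numpy as np

def downscale_2x(img):
    """Halve height and width by averaging each 2x2 block."""
    h, w = img.shape
    img = img[: h - h % 2, : w - w % 2]  # trim an odd trailing row/column
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def sharpen(img):
    """Apply a 3x3 sharpening kernel with edge padding, clipped to 0..255."""
    kernel = np.array([[0, -1, 0], [-1, 5, -1], [0, -1, 0]], dtype=float)
    padded = np.pad(img.astype(float), 1, mode="edge")
    rows, cols = img.shape
    out = np.zeros((rows, cols), dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy : dy + rows, dx : dx + cols]
    return np.clip(out, 0, 255)

# A made-up 28x28 grayscale image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(28, 28)).astype(float)

small = downscale_2x(image)  # 784 pixel features become 196
crisp = sharpen(small)
print(small.shape, crisp.shape)
```

Because the kernel weights sum to 1, flat regions keep their brightness while edges are amplified.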

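Once you've built up your skills, come and try Tmall Global at Alibaba; internships and experienced hires are both welcome

Tmall Global is the No. 1 cross-border platform for ** consumption upgrading and the main force behind the Alibaba economy's pledge to import US$200 billion of goods over five years. In 2019 the Tmall Global technology department merged with Kaola to form Alibaba's 大进口 (Greater Import) technology department, the core technology department for Alibaba's internationalization strategy. It works on technical breakthroughs and innovation for the import business, helping ** consumers "buy from all over the world" and move into a future trillion-scale market. If you'd like to know more, contact me directly and I can refer you within the team. There is plenty of headcount (HC), so don't miss out.
Email: [email protected]
WeChat: isHunterZhang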

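[Collection] Competition projects + feature engineering

  1. Multi-label (multi_label) classification
  2. Time series problems (univariate + multivariate)

[Collection] Real-time streaming

[Collection] Data collection

[Collection] Feature engineering

Data processing

Subtracting the data's statistical mean from every sample removes the component the samples share and highlights individual differences.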
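A minimal sketch of this mean subtraction with NumPy, assuming the mean is taken per feature over all samples; the array `X` is made-up data.

```python
import numpy as np

# Made-up data: 3 samples (rows), 2 features (columns).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

feature_mean = X.mean(axis=0)  # mean of each feature over all samples
X_centered = X - feature_mean  # broadcasting subtracts it from every sample

print(X_centered)
```

After centering, each column has zero mean, so what remains is each sample's deviation from the average.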

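Single-machine feature engineering with sklearn

Feature engineering series

The confusion matrix and using the confusion_matrix function

sklearn grid search: finding the best parameters

Question about the house price model tutorial

What is the rmsle_cv function in readme.md under kaggle/competitions/getting-started/house-price/? I typed in the code as written and got an error, and many of the functions used there are never imported, for example: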

from sklearn.preprocessing import RobustScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import Lasso, ElasticNet
from sklearn.kernel_ridge import KernelRidge
from sklearn.ensemble import GradientBoostingRegressor
import xgboost as xgb
import lightgbm as lgb
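The README's own definition of rmsle_cv is not reproduced in this issue. In public House Prices notebooks, a helper with this name usually computes a cross-validated RMSE on a log-transformed target. The sketch below follows that convention and is an assumption, not the tutorial's code. The synthetic data, the `n_folds` default and the Lasso settings are all illustrative.

```python
import numpy as np
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import RobustScaler

def rmsle_cv(model, X, y_log, n_folds=5):
    """Cross-validated RMSE; this equals RMSLE when y_log is log1p(price)."""
    kf = KFold(n_splits=n_folds, shuffle=True, random_state=42)
    neg_mse = cross_val_score(
        model, X, y_log, scoring="neg_mean_squared_error", cv=kf
    )
    return np.sqrt(-neg_mse)

# Synthetic stand-in data (not the Kaggle data set).
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))
price = np.exp(X @ np.array([0.5, -0.2, 0.1, 0.0, 0.3]) + 12.0)
y_log = np.log1p(price)

lasso = make_pipeline(RobustScaler(), Lasso(alpha=0.0005, random_state=1))
scores = rmsle_cv(lasso, X, y_log)
print(scores.mean(), scores.std())
```

Passing the data in as arguments, rather than reading module-level variables, avoids the NameError that appears when the function is pasted without its surrounding setup.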

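svm-python3.6.py: the saveResult function writes extra blank lines to the output

Pass newline='' when calling open:

import csv

def saveResult(result, csvName):
    # newline='' stops the csv module from writing a blank line after each row on Windows
    with open(csvName, 'w', newline='') as myFile:
        myWriter = csv.writer(myFile)
        myWriter.writerow(["ImageId", "Label"])
        index = 0
        for r in result:
            index += 1
            myWriter.writerow([index, int(r)])
    print('Saved successfully...')  # prediction results saved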

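[Collection] Interview terminology

Business modeling:
Data modeling: applying data modeling techniques to analyze data objects and so gain insight into what the data means.

Statistical analysis:
Comparative analysis:
Cluster analysis: the process of grouping similar objects together, with each group of similar objects forming a cluster. The aim is to analyze the differences and similarities within the data.
Regression analysis:
Discriminant analysis:
Correlation analysis:
Correlation coefficient: a measure of the degree of linear correlation between variables (the Pearson correlation coefficient is the most commonly used)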
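As a small worked illustration of the Pearson correlation coefficient, the sketch below computes it from its definition (covariance divided by the product of the standard deviations) in plain Python. The helper name `pearson_r` and the sample values are made up for the example.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

print(pearson_r([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly linear, increasing
print(pearson_r([1, 2, 3, 4], [8, 6, 4, 2]))  # perfectly linear, decreasing
```

The result always lies between -1 and 1; values near 0 indicate little linear relationship, although a nonlinear one may still exist.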

Anomaly detection: searching a data set for items that do not match an expected pattern or behavior.
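One simple way to make "does not match the expected pattern" concrete is a z-score rule: flag values that lie more than a chosen number of standard deviations from the mean. This is only one of many possible approaches. The helper name `zscore_outliers`, the threshold of 2.0 and the readings are assumptions made for the sketch.

```python
import statistics

def zscore_outliers(values, threshold=2.0):
    """Return the values whose distance from the mean exceeds threshold * std."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]

readings = [10, 11, 9, 10, 12, 10, 11, 50]
print(zscore_outliers(readings))
```

On this sample only the reading 50 is flagged. Note that a large outlier also inflates the mean and standard deviation, so robust variants (for example based on the median) are often preferred on real data.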

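Data sampling:
Data augmentation:

Feature selection:

Feature engineering:
Data cleaning: the process of re-examining and validating data in order to remove duplicate records, correct existing errors, and make the data consistent.
Dimensionality reduction: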
